Abstract


We give an algorithm that computes exact maximum flows and minimum-cost flows on 
directed graphs with m edges and polynomially bounded integral demands, costs, and capacities 
in $m^{1+o(1)}$ time. Our algorithm builds the flow through a sequence of $m^{1+o(1)}$ approximate
undirected minimum-ratio cycles, each of which is computed and processed in amortized $m^{o(1)}$
time using a new dynamic graph data structure. 

Our framework extends to algorithms running in $m^{1+o(1)}$ time for computing flows that
minimize general edge-separable convex functions to high accuracy. This gives almost-linear 
time algorithms for several problems including entropy-regularized optimal transport, matrix 
scaling, p-norm flows, and p-norm isotonic regression on arbitrary directed acyclic graphs. 
1 Introduction


The maximum flow problem and its generalization, the minimum-cost flow problem, are classic 
combinatorial graph problems that find numerous applications in engineering and scientific com- 
puting. These problems have been studied extensively over the last seven decades, starting from the 
work of Dantzig and Ford-Fulkerson, and several important algorithmic problems can be reduced 
to min-cost flows (e.g. max-weight bipartite matching, min-cut, Gomory-Hu trees). The origin of 
numerous significant algorithmic developments such as the simplex method, graph sparsification, 
and link-cut trees, can be traced back to seeking faster algorithms for max-flow and min-cost flow. 

Formally, we are given a directed graph G = (V, E) with |V| = n vertices and |E| = m edges,
upper/lower edge capacities $u^+, u^- \in \mathbb{R}^E$, edge costs $c \in \mathbb{R}^E$, and vertex demands $d \in \mathbb{R}^V$ with
$\sum_{v \in V} d_v = 0$. Our goal is to find a flow $f \in \mathbb{R}^E$ of minimum cost $c^\top f$ that respects edge capacities
$u^-_e \le f_e \le u^+_e$ and satisfies vertex demands d. The vertex demand constraints are succinctly
captured as $B^\top f = d$, where $B \in \mathbb{R}^{E \times V}$ is the edge-vertex incidence matrix defined by $B_{((a,b),v)} = 1$
if v = a, $-1$ if v = b, and 0 otherwise. To compare running times, we assume that all $u^+_e$, $u^-_e$, $c_e$
and $d_v$ are integral, and $|u^+_e|, |u^-_e| \le U$ and $|c_e| \le C$.
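
To make the notation concrete, the following minimal sketch builds the incidence matrix B for a toy instance and checks the two feasibility conditions, $B^\top f = d$ and the capacity bounds. The function names and the three-edge example are illustrative assumptions, not part of the paper.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]  # directed edge (a, b) from tail a to head b


def incidence_matrix(n: int, edges: Sequence[Edge]) -> List[List[int]]:
    """Edge-vertex incidence matrix B: the row of edge (a, b) has +1 at a and -1 at b."""
    B = [[0] * n for _ in edges]
    for row, (a, b) in zip(B, edges):
        row[a] += 1
        row[b] -= 1
    return B


def net_outflow(B, f, n):
    """Compute B^T f; entry v is (flow leaving v) minus (flow entering v)."""
    return [sum(B[e][v] * f[e] for e in range(len(f))) for v in range(n)]


def is_feasible(B, f, d, lower, upper):
    """Check B^T f = d and lower_e <= f_e <= upper_e for every edge."""
    routes_demands = net_outflow(B, f, len(d)) == list(d)
    within_caps = all(lo <= x <= hi for lo, x, hi in zip(lower, f, upper))
    return routes_demands and within_caps


def flow_cost(c, f):
    """The objective c^T f."""
    return sum(ce * fe for ce, fe in zip(c, f))


# Hypothetical toy instance: a triangle carrying one unit on every edge.
edges = [(0, 1), (1, 2), (0, 2)]
B = incidence_matrix(3, edges)
f = [1, 1, 1]
d = [2, 0, -2]  # demands sum to zero, as required
print(is_feasible(B, f, d, [0, 0, 0], [1, 1, 1]), flow_cost([1, 1, 3], f))
```

With this sign convention, a positive demand $d_v$ means that vertex v has net outflow.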

There has been extensive work on max-flow and min-cost flow. While we defer a longer dis- 
cussion of the related works to Appendix A, a brief discussion will help place our work in context. 
Starting from the first pseudo-polynomial time algorithm by Dantzig [Dan51] that ran in $O(mn^2U)$
time, the approach to designing faster flow algorithms was primarily combinatorial, working with
various adaptations of augmenting paths, cycle cancelling, blocking flows, and capacity/cost scaling.
A long line of work led to a running time of $\tilde{O}(m \min\{m^{1/2}, n^{2/3}\} \log U)$ [HK73; Kar73; ET75;
GR98] for max-flow, and $\tilde{O}(mn \log U)$ [GT87] for min-cost flow. These bounds stood for decades.

In their breakthrough work on solving Laplacian systems and computing electrical flows, Spiel- 
man and Teng [ST04] introduced the idea of combining continuous optimization primitives with 
graph-theoretic constructions for designing flow algorithms. This is often referred to as the Lapla- 
cian Paradigm. Daitch and Spielman [DS08] demonstrated the power of this paradigm by combining 
interior point methods (IPMs) with fast Laplacian system solvers to achieve an $\tilde{O}(m^{3/2} \log^2 U)$
time algorithm for min-cost flow, the first progress in two decades. A key advantage of IPMs is
that they reduce flow problems on directed graphs to problems on undirected graphs, which are
easier to work with. The Laplacian paradigm achieved several successes, including $\tilde{O}(m\epsilon^{-1})$ time
$(1 + \epsilon)$-approximate undirected max-flow and multicommodity flow [CKMST11; KLOS14; She13;
Pen16; She17], an $m^{4/3+o(1)}U^{1/3}$ time algorithm for bipartite matching and unit capacity
max-flow [Mad13; Mad16; LS20; KLS20; AMV20], and $pm^{1+o(1)}$ time unweighted p-norm minimizing
flow for large p [KPSW19; AS20].

Efficient graph data-structures have played a key role in the development of faster algorithms 
for flow problems, e.g. dynamic trees [ST83]. Recently, the development of special-purpose data
structures for efficient implementation of IPM-based algorithms has led to progress on min-cost flow
for some cases, including an $\tilde{O}(m \log U + n^{1.5} \log^2 U)$ time algorithm [BLSS20; BLNPSSSW20;
BLLSSSW21], an $\tilde{O}(n \log U)$ time algorithm for planar graphs [DLY21; DGGLPSY22], and small
improvements for general graphs, resulting in an $\tilde{O}(m^{3/2-1/58} \log^2 U)$ time algorithm for
min-cost flow [BGS21; GLP21; AMV21; BGJLLPS21]. Yet, despite this progress, the best running time
bounds in general graphs are far from linear. We give the first almost-linear time algorithm for 
min-cost flow, achieving the optimal running time up to subpolynomial factors. 


Theorem 1.1. There is an algorithm that, on a graph G = (V, E) with m edges, vertex demands, 
upper/lower edge capacities, and edge costs, all integral with capacities bounded by U and costs 
bounded by C, computes an exact min-cost flow in $m^{1+o(1)} \log U \log C$ time with high probability.


Our algorithm implements a new IPM that solves min-cost flow via a sequence of slowly-changing 
undirected min-ratio cycle subproblems. We exploit randomized tree-embeddings to design new 
data-structures to efficiently maintain approximate solutions to these subproblems. 

A direct reduction from max-flow to min-cost flow gives us an algorithm for max-flow with only 
a $\log U$ dependence on the capacity range U.¹


Corollary 1.2. There is an algorithm that on a graph G with m edges with integral capacities in 
[1,U] computes a maximum flow between two vertices in time $m^{1+o(1)} \log U$ with high probability.
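
The reduction behind Corollary 1.2 (spelled out in the footnote on max-flow) can be sketched as follows. This is an illustrative construction only: the function name is hypothetical, and taking U to be the largest input capacity is an assumption made for the example.

```python
def max_flow_to_min_cost_circulation(n, edges, capacities, s, t):
    """Build a min-cost circulation instance whose optimum encodes an s-t max-flow.

    Adds one edge t -> s with capacity range [0, m*U] and cost -1; all original
    edges keep their capacities and get cost 0; all vertex demands are 0.
    """
    m = len(edges)
    U = max(capacities, default=0)  # assumption: U is the largest capacity
    new_edges = list(edges) + [(t, s)]
    lower = [0] * (m + 1)
    upper = list(capacities) + [m * U]
    costs = [0] * m + [-1]
    demands = [0] * n
    return new_edges, lower, upper, costs, demands


# Hypothetical example: a path 0 -> 1 -> 2 with capacities 3 and 2.
instance = max_flow_to_min_cost_circulation(3, [(0, 1), (1, 2)], [3, 2], s=0, t=2)
new_edges, lower, upper, costs, demands = instance
# Sending 2 units around the cycle 0 -> 1 -> 2 -> 0 is a feasible circulation.
circulation = [2, 2, 2]
print(sum(c * x for c, x in zip(costs, circulation)))  # cost is minus the flow value
```

Minimizing the cost of the circulation maximizes the flow on the added $t \to s$ edge, which equals the value of the s-t flow on the original edges.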


By classic capacity scaling techniques [Gab85; GT88b; AGOT92], it suffices to work with graphs 
with U,C = poly(m) to show Theorem 1.1 and Corollary 1.2. For completeness, we include our 
version of the reductions in Appendix C, as we could not find a readily citable version. 


1.1 Applications 


Our result in Theorem 1.1 has a wide range of applications. By standard reductions, it gives the first
$m^{1+o(1)}$ time algorithm for the bipartite matching problem and $m^{1+o(1)} \log U \log C$ time algorithms
for its generalizations, including the worker assignment and optimal transport problems. 

In directed graphs with possibly negative edge weights, assuming integral weights bounded by 
W in absolute value, we obtain the first almost-linear time algorithm to compute single-source 
shortest paths and to detect a negative cycle, running in $m^{1+o(1)} \log W$ time (see Appendix D for
a reduction). In independent work, Bernstein, Nanongkai, and Wulff-Nilsen [BNW22] claim the
first $m \cdot \mathrm{poly}(\log m) \log W$ time algorithm for this problem.

Using recent reductions from various connectivity problems to max-flow, we also obtain the first
$m^{1+o(1)}$ time algorithms for various such problems, most prominently to compute vertex connectivity
and Gomory-Hu trees in undirected, unweighted graphs, and $(1 + \epsilon)$-approximate Gomory-Hu
trees in undirected weighted graphs. We also obtain the fastest current algorithm to find the global
min-cut in a directed graph. Finally, we obtain the first almost-linear time algorithms to compute
approximate sparsest cuts in directed graphs. We defer the discussion of these results and precise 
statements to Appendix D. 

Additionally, we extend our algorithm to compute flows that minimize general edge-separable 
convex objectives. This allows us to solve regularized versions of optimal transport (equivalently, 
matrix scaling), as well as p-norm flow problems and p-norm isotonic regression for all $p \in [1, \infty]$.
We state an informal version of our main result Theorem 10.13 on general convex flows. 


Informal Theorem 1.3. Consider a graph G with demands d, and an edge-separable convex cost
function $\mathrm{cost}(f) = \sum_e \mathrm{cost}_e(f_e)$ for “computationally efficient” edge costs $\mathrm{cost}_e$. Then in $m^{1+o(1)}$
time, we can compute a (fractional) flow f that routes demands d and satisfies $\mathrm{cost}(f) \le \mathrm{cost}(f^*) +
\exp(-\log^C m)$ for any constant C > 0, where $f^*$ minimizes $\mathrm{cost}(f^*)$ over flows with demands d.


We remark that the optimal solution f* to the above convex flow problem can be non-integral, 
whereas in the case of max-flow and min-cost flow with integral demands/capacities, there exists 
an integral optimal flow. 


1.2 Key Technical Contributions 


Towards proving our results, we make several algorithmic contributions. We informally describe 
the key pieces here, and present a more detailed overview in Section 2. 


¹ s-t max-flow can be reduced to min-cost circulation by adding a new edge $t \to s$ with lower capacity 0 and upper
capacity mU. Set all demands to be 0. The cost of the $t \to s$ edge is $-1$. All other edges have zero cost.


Our first contribution is a new potential reduction IPM for min-cost flow, inspired by [Kar84],
that reduces min-cost flow to a sequence of $m^{1+o(1)}$ slowly-changing instances of undirected
minimum-ratio cycle. Each instance of undirected min-ratio cycle is specified by an undirected graph where
every edge e is assigned a positive length $\ell_e$ and a signed gradient $g_e$, and the goal is to find a
circulation $c \in \mathbb{R}^E$, i.e. c satisfies $B^\top c = 0$, with the smallest ratio $g^\top c / \|Lc\|_1$, where $L = \mathrm{diag}(\ell)$
is the diagonal length matrix. Note that the graph is undirected in the sense that each edge can be
traversed in either direction and has the same length in either direction. However, the contribution
of the edge gradient changes sign depending on the direction in which the edge is traversed.
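
As a small illustration of this objective, the sketch below evaluates the ratio $g^\top c / \|Lc\|_1$ for a circulation on a triangle and shows that reversing the direction of traversal flips its sign. The helper names and the numeric values are assumptions chosen for the example.

```python
def is_circulation(n, edges, c):
    """Check B^T c = 0: net flow out of every vertex is zero."""
    net = [0] * n
    for (a, b), ce in zip(edges, c):
        net[a] += ce
        net[b] -= ce
    return all(x == 0 for x in net)


def cycle_ratio(g, lengths, c):
    """Return g^T c / ||L c||_1 for a nonzero circulation c."""
    numerator = sum(ge * ce for ge, ce in zip(g, c))
    denominator = sum(le * abs(ce) for le, ce in zip(lengths, c))
    return numerator / denominator


# Hypothetical instance: a triangle with edges (0,1), (1,2), (0,2).
edges = [(0, 1), (1, 2), (0, 2)]
g = [1, -4, 2]          # signed gradients
lengths = [1, 2, 1]     # positive lengths
c = [1, 1, -1]          # one unit around 0 -> 1 -> 2 -> 0 (edge (0,2) is used backwards)
reversed_c = [-x for x in c]

print(is_circulation(3, edges, c), cycle_ratio(g, lengths, c), cycle_ratio(g, lengths, reversed_c))
```

The denominator uses absolute values, so it is the same in both directions; only the numerator changes sign.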

Below is an informal statement summarizing the IPM guarantees proven in Section 4. 


Informal Theorem 1.4 ($\ell_1$ IPM Algorithm). We give an IPM algorithm that reduces solving
min-cost flow exactly to sequentially solving $m^{1+o(1)}$ instances of undirected min-ratio cycle, each up to
an $m^{o(1)}$ approximation. Further, the resulting problem instances are “stable”, i.e. they satisfy: 1)
the direction from the current flow to the (unknown) optimal flow is a good enough solution for each
of the instances, and 2) the length and gradient input parameters to the instances change only for
an amortized $m^{o(1)}$ edges every iteration.


The standard IPM approach reduces min-cost flow to solving $\tilde{O}(\sqrt{m})$ instances of electrical flow,
which is an $\ell_2$ minimization problem, to constant accuracy. At the cost of solving a larger number
of resulting subproblems, our algorithm offers several advantages: undirected min-ratio cycle is
an $\ell_1$ minimization problem which is hopefully simpler (e.g. note that the optimal solution must
be a simple cycle) and we can afford a large $m^{o(1)}$ approximation factor in the subproblems. Most
analogous to our approach is an early interior point method by [WZ92]² which solved minimum
cost flow using (exact) $\ell_1$ min-ratio cycle subproblems. Their subproblems, however, do not satisfy
the stability guarantees that are essential for our approach to quickly solving the subproblems
approximately. Our IPM is robust to updates with much worse approximation factors than those
required in the recent works on robust interior point methods ([CLS19] and many later works)
and establishes a different notion of stability w.r.t. gradients, lengths, and solution witnesses. This
perspective may be of independent interest.

In contrast to most IPMs that work with the log barrier, our IPM uses a power barrier which 
aggressively penalizes constraints that are very close to being violated, more so than the usual log 
barrier. This ensures polylogarithmic bit-complexity throughout our algorithm. 

Since a large approximation suffices, one can use a probabilistic low stretch spanning tree T
[AKPW95; AN19] computed with respect to the lengths $\ell$ and use a fundamental tree cycle to find
an $\tilde{O}(1)$-approximate solution in time $\tilde{O}(m)$ (see Section 2.2). However, the changes to gradients
and lengths caused by the flow updates during the IPM iterations force us to compute a
new probabilistic low stretch spanning tree $T'$ with respect to the new edge lengths. But computing
a new tree in time $\Omega(m)$ per iteration results in much too large a runtime.

Our approach instead rebuilds only parts of the probabilistic low-stretch spanning tree after each
IPM iteration to adapt to the changes in lengths. To implement this, we design a data structure
which maintains a recursive sequence of instances of the min-ratio cycle problem on graphs with
fewer vertices and fewer edges. These smaller instances give worse approximate solutions, but are
cheaper to maintain. We use a j-tree style approach [Mad10] where we interleave vertex reduction
by partial embeddings into trees with edge reductions via spanners, and exploit the stability of the
IPM. However, using a j-tree as in [Mad10] naively still requires $m^{1+o(1)}$ time per instance. Our
second contribution is to push this approach much further, to give a randomized data structure
that can return $m^{o(1)}$-approximate solutions to all $m^{1+o(1)}$ undirected min-ratio cycle instances
generated by the IPM in $m^{1+o(1)}$ total time. Our approach leads to a strong form of a dynamic
vertex sparsifier (in the spirit of [CGHPS20]). The stability of the instances generated by our IPM
algorithm is essential to achieve low amortized time per instance.


²We thank an attentive reader for making us aware of this connection.


Informal Theorem 1.5 (Hidden Stable-Flow Chasing. Theorem 6.2). We design a randomized
data structure for approximately solving a sequence of “stable” (as defined in Informal Theorem 1.4)
undirected min-ratio cycle instances. The data structure maintains a collection of $m^{o(1)}$ spanning
trees and supports the following operations with high probability in amortized $m^{o(1)}$ time: 1) return
an $m^{o(1)}$-approximate min-ratio cycle (implicitly represented as the union of $m^{o(1)}$ off-tree edges and
tree paths on one of the maintained trees), 2) route a circulation along such a cycle, 3) insert/delete
an edge e, or update $g_e$ and $\ell_e$, and 4) identify edges that have accumulated significant flow.


To achieve efficient edge reduction over the entire sequence of subproblems, we give an algorithm 
that can efficiently maintain a spanner of a given graph (a sparse subgraph that can embed the 
original graph using short paths) with explicit embeddings under edge deletions/insertions and 
vertex splits. Removing edges can completely destroy the min-ratio cycles in the graph. However, 
in that case, we can find a good approximate min-ratio cycle using the removed edges along with 
their explicit spanner embeddings. This spanner is our third key contribution. 


Informal Theorem 1.6 (Dynamic Spanner w/ Embeddings. Theorem 5.1). We give a randomized
data structure that, for an unweighted, undirected graph G undergoing edge updates
(insertions/deletions/vertex splits), maintains a subgraph H with $\tilde{O}(n)$ edges, along with an explicit path
embedding of every $e \in G$ into H of length $m^{o(1)}$. The amortized number of edge changes in H is
$m^{o(1)}$ for every edge update. Moreover, the set of edges that embed into a fixed edge $e \in H$ is
decremental for all edges e, except for an amortized set of $m^{o(1)}$ edges per update.

This algorithm can be implemented efficiently. 


By designing a spanner which changes very little under input graph modifications including 
edge insertions/deletions and vertex splits, we make it possible to dynamically combine edge and 
vertex sparsification very efficiently, even in a recursive construction. 

Finally, note that our data structures for hidden stable-flow chasing and spanner maintenance
are used to efficiently implement the $\ell_1$ IPM algorithm. Thus, the subsequent undirected
min-ratio cycle instances can change depending on the approximately optimal cycles returned by our
algorithm. In the terminology of dynamic graph algorithms, the sequence of undirected min-ratio 
cycle problems we need to solve is not oblivious (to the answers returned by the algorithm). This 
adaptivity creates significant additional challenges for the data-structures that need addressing. 


1.3 Paper Organization 


The remainder of the paper is organized as follows. In Section 2 we elaborate on each major piece 
of our algorithm: the $\ell_1$-IPM based on undirected minimum-ratio cycles, the construction of the
data structure for maintaining undirected minimum-ratio cycles for “stable” update sequences, and 
a spanner with explicit path embeddings in dynamic graphs. In Section 3 we give the preliminaries. 

The algorithm to obtain our main result (Theorem 1.1), the min-cost flow algorithm, is given on 
pages 24-72 in Sections 4-9, with some omitted proofs in Appendix B. The rest of the paper addresses 
generalization to convex costs, connections to the broader flow literature, and applications. 

In Section 4 we give an iterative method which shows that a minimum cost flow can be computed
to high accuracy in $m^{1+o(1)}$ iterations, each of which augments by an $m^{o(1)}$-approximate undirected
minimum-ratio cycle. In Section 5 we construct our dynamic spanner with path embeddings.
The goal of Sections 6 to 8 is to show our main data structure (Theorem 6.2) for maintaining
undirected minimum-ratio cycles. Section 6 sets up the framework for describing “stable” update
sequences, and describes the main data structure components. Section 7 formally constructs the
data structure modulo a technical issue, which we resolve by introducing and solving the rebuilding
game in Section 8. In Section 9 we combine all the pieces we have developed to give a min-cost
flow algorithm running in time $m^{1+o(1)}$.

In the last part of the paper, Section 10, we extend the IPM analysis to handle general 
edge-separable convex, nonlinear objectives, such as normed flows, isotonic regression, entropy- 
regularized optimal transport, and matrix scaling. 

The appendix contains an overview of previous max-flow and min-cost flow approaches in Ap- 
pendix A, omitted proofs in Appendix B, a proof of capacity scaling for min-cost flows in Ap- 
pendix C, and an extensive description of applications of our algorithms in Appendix D. 


2 Overview 


In this section, we give a technical overview of the key pieces developed in this paper. Section 2.1 
describes an optimization method based on interior point methods that reduces min-cost flow to 
a sequence of $m^{1+o(1)}$ undirected minimum-ratio cycle computations. In particular, we reduce the
problem to computing approximate min-ratio cycles on a slowly changing graph. This can be 
naturally formulated as a data structure problem of maintaining min-ratio cycles approximately on 
a dynamic graph. 

We build a data structure that solves this dynamic min-ratio cycle problem with
$m^{o(1)}$ amortized time per cycle update for our IPM, giving an overall running time of $m^{1+o(1)}$.
Section 2.2 gives an overview of our data structure for this dynamic min-ratio cycle problem, with
pointers to the rest of the overview which provides a more in-depth picture of the construction. The 
data structure creates a recursive hierarchy of graphs with fewer and fewer vertices and edges. In 
Section 2.3 we describe how to reduce the number of vertices, before describing the overall recursive 
data structure in Section 2.4. Naively, the resulting data structure works only against oblivious 
adversaries where updates and queries to the data structure are fixed beforehand. We cannot utilize 
it directly because the optimization routine updates the dynamic graph based on past outputs from 
the data structure. Therefore, the cycles output by the data structure may not be good enough 
to make progress. Section 2.5 discusses the interaction between the optimization routine and the 
data structure when we directly apply it. It turns out one can leverage properties of the interaction 
and adapt the data structure for the optimization routine. Section 2.6 presents an online algorithm 
that manipulates the data structure so that it always outputs cycles that are good enough to make 
progress in the optimization routine. Finally, the overview ends with Section 2.7 which gives an 
outline of our dynamic spanner data structure. We use this spanner to reduce the number of edges 
at each level of our recursive hierarchy, one of the main algorithmic elements of our data structure. 


2.1 Computing Min-Cost Flows via Undirected Min-Ratio Cycles 
The goal of this section is to describe an optimization method which computes a min-cost flow on
a graph G = (V, E) in $m^{1+o(1)}$ computations of $m^{o(1)}$-approximate min-ratio cycles:

$$\min_{B^\top \Delta = 0} \frac{g^\top \Delta}{\|L\Delta\|_1} \tag{1}$$

for gradient $g \in \mathbb{R}^E$ and lengths $L = \mathrm{diag}(\ell)$ for $\ell \in \mathbb{R}^E_{>0}$. Note that the value of this objective is
negative, as $-\Delta$ is a circulation if $\Delta$ is.


Towards this, we work with the linear-algebraic setup of the min-cost flow problem: 


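$$f^* \in \arg\min_{\substack{B^\top f = d \\ u^-_e \le f_e \le u^+_e \text{ for all } e \in E}} c^\top f \tag{2}$$

for demands $d \in \mathbb{R}^V$, lower and upper capacities $u^-, u^+ \in \mathbb{R}^E$, and cost vector $c \in \mathbb{R}^E$. Our goal
is to compute an optimal flow $f^*$. Let $F^* = c^\top f^*$ be the optimal cost.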

Our algorithm is based on a potential reduction interior point method [Kar84], where each 
iteration we reduce the value of the potential function 


$$\Phi(f) = 20m \log(c^\top f - F^*) + \sum_{e \in E} \left( (u^+_e - f_e)^{-\alpha} + (f_e - u^-_e)^{-\alpha} \right) \tag{3}$$


for $\alpha = 1/(1000 \log mU)$. The reader can think of the barrier $x^{-\alpha}$ as the more standard $-\log x$
for simplicity instead. We use $x^{-\alpha}$ to ensure that all lengths/gradients encountered during the
algorithm can be represented using $\tilde{O}(1)$ bits, and explain why this holds later in the section.
When $\Phi(f) \le -200m \log mU$, we can terminate because then $c^\top f - F^* \le (mU)^{-10}$, at which
point standard techniques let us round to an exact optimal flow [DS08]. Thus if we can reduce the
potential by $m^{-o(1)}$ per iteration, the method terminates in $m^{1+o(1)}$ iterations.

Previous analyses of IPMs used $\ell_2$ subproblems, i.e. replacing the $\ell_1$ norm in (1) with an $\ell_2$
norm, which can be solved using a linear system. [Kar84] shows that using $\ell_2$ subproblems such a
method converges in $\tilde{O}(m)$ iterations. Later analyses of path-following IPMs [Ren88] showed that
a sequence of $\tilde{O}(\sqrt{m})$ $\ell_2$ subproblems suffices to give a high-accuracy solution. Surprisingly, we are
able to argue that solving a sequence of $\tilde{O}(m)$ $\ell_1$ minimizing subproblems of the form in (1) suffices
to give a high-accuracy solution to (2). In other words, changing the $\ell_2$ norm to an $\ell_1$ norm does
not increase the number of iterations in a potential reduction IPM. The use of an $\ell_1$-norm-based
subproblem gives us a crucial advantage: problems of this form must have optimal solutions in the
form of cycles, and our new algorithm finds approximately optimal cycles vastly more efficiently
than any known approaches for $\ell_2$ subproblems.

There are several reasons we choose to use a potential reduction IPM with this specific potential. 
The most important reason is that the flexibility of a potential reduction IPM allows our data structure
for maintaining solutions to (1) to have large $m^{o(1)}$ approximation factors. This contrasts with
recent works towards solving min-cost flow and linear programs using a robust IPM (see [CLS19] 
or the tutorial [LV21]), which require (1 + o(1))-approximate solutions for the iterates. 

Finally, we use the barrier $x^{-\alpha}$ as opposed to the more standard logarithmic barrier in order to
guarantee that all lengths/gradients encountered during the method are bounded by $\exp(\log^{O(1)} m)$
throughout the method. This follows because if $(u^+_e - f_e)^{-\alpha} \le \tilde{O}(m)$, then

$$u^+_e - f_e \ge \tilde{O}(m)^{-1/\alpha} \ge \exp(-O(\log^2 mU)).$$

Such a guarantee does not hold for the logarithmic barrier.³
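
The contrast between the two barriers can be made explicit with a short calculation. Here M is a hypothetical symbol standing for the bound on a single barrier term (the text uses a bound of order m up to polylogarithmic factors), and $\alpha = 1/(1000 \log mU)$ as above.

```latex
% Power barrier: a bounded barrier term forces only a quasi-polynomially small slack.
(u^+_e - f_e)^{-\alpha} \le M
\;\Longleftrightarrow\;
u^+_e - f_e \ge M^{-1/\alpha}
= \exp\!\left(-\tfrac{1}{\alpha}\log M\right)
= \exp\!\left(-1000\,\log(mU)\,\log M\right).

% Logarithmic barrier: the same bound permits an exponentially small slack.
-\log(u^+_e - f_e) \le M
\;\Longleftrightarrow\;
u^+_e - f_e \ge \exp(-M).
```

When $\log M = O(\log m)$, the first bound is $\exp(-O(\log^2 mU))$, so the slack and its inverse powers fit in polylogarithmically many bits. The second bound only gives $\exp(-M)$, which for M of order m would require on the order of m bits.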

To conclude, we discuss a few specifics of the method, such as how to pick the lengths and
gradients, and how to prove that the method makes progress. Given a current flow f we define the
gradient and lengths we use in (1) as $g(f) = \nabla\Phi(f)$ and $\ell(f)_e = (u^+_e - f_e)^{-1-\alpha} + (f_e - u^-_e)^{-1-\alpha}$.
Now, let $\Delta$ be a circulation with $g(f)^\top \Delta / \|L\Delta\|_1 \le -\kappa$ for some $\kappa < 1/100$, scaled so that
$\|L\Delta\|_1 = \kappa/50$. A direct Taylor expansion shows that $\Phi(f + \Delta) \le \Phi(f) - \kappa^2/500$ (Lemma 4.4).
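
The potential, its gradient, and the lengths can be written down directly from these definitions. The sketch below is illustrative only: the function names are hypothetical, and it passes the optimal cost `F_star` as a known number purely so that the formulas can be evaluated on a toy instance.

```python
import math


def barrier_exponent(m, U):
    """alpha = 1 / (1000 log(mU))."""
    return 1.0 / (1000.0 * math.log(m * U))


def potential(f, c, F_star, lower, upper, a):
    """Phi(f) = 20 m log(c^T f - F*) + sum_e ((u+_e - f_e)^-a + (f_e - u-_e)^-a)."""
    m = len(f)
    gap = sum(ce * fe for ce, fe in zip(c, f)) - F_star
    barrier = sum((u - x) ** (-a) + (x - l) ** (-a) for x, l, u in zip(f, lower, upper))
    return 20 * m * math.log(gap) + barrier


def gradient(f, c, F_star, lower, upper, a):
    """g(f) = grad Phi(f), computed coordinate by coordinate."""
    m = len(f)
    gap = sum(ce * fe for ce, fe in zip(c, f)) - F_star
    return [
        20 * m * ce / gap + a * (u - x) ** (-1 - a) - a * (x - l) ** (-1 - a)
        for ce, x, l, u in zip(c, f, lower, upper)
    ]


def lengths(f, lower, upper, a):
    """l(f)_e = (u+_e - f_e)^(-1-a) + (f_e - u-_e)^(-1-a)."""
    return [(u - x) ** (-1 - a) + (x - l) ** (-1 - a) for x, l, u in zip(f, lower, upper)]


# Hypothetical toy data: two edges, capacities in [0, 4], assumed optimum F* = 0.
c, lower, upper, f = [1.0, 2.0], [0.0, 0.0], [4.0, 4.0], [1.0, 2.0]
a = barrier_exponent(m=2, U=4)
print(potential(f, c, 0.0, lower, upper, a))
print(gradient(f, c, 0.0, lower, upper, a))
print(lengths(f, lower, upper, a))
```

Each length grows as $f_e$ approaches either capacity bound, so cycles that push flow toward a nearly tight constraint become long and are penalized in the ratio (1).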


³The reason that path-following IPMs for max-flow [DS08] do not encounter this issue is that one can show
that primal-dual optimality actually guarantees that the lengths/resistances are polynomially bounded. We do not 
maintain any dual variables, so such a guarantee does not hold for our algorithm. 


Hence it suffices to show that such a $\Delta$ exists with $\kappa = \tilde{\Omega}(1)$, because then a data structure
which returns an $m^{o(1)}$-approximate solution still has $\kappa = m^{-o(1)}$, which suffices. Fortunately, the
witness circulation $\Delta(f)^* = f^* - f$ satisfies $g(f)^\top \Delta(f)^* / \|L\Delta(f)^*\|_1 \le -\tilde{\Omega}(1)$ (Lemma 4.7).

We emphasize that the fact that f* — f is a good enough witness circulation for the flow f is 
essential for proving that our randomized data structures suffice, even though the updates seem 
adaptive. At a high level, this guarantee helps because even though we do not know the witness 
circulation f* — f, we know how it changes between iterations, because we can track changes in f. 
We are able to leverage such guarantees to make our data structures succeed for the updates coming 
from the IPM. To achieve this, we carefully design our adversary model with enough
power to capture our IPM, but with enough restrictions that our min-ratio cycle data structure can
win against the adversary. We elaborate on this point in Sections 2.2 and 2.5.


2.2 High Level Overview of the Data Structure for Dynamic Min-Ratio Cycle 


As discussed in the previous section, our algorithm computes a min-cost flow by solving a sequence of
$m^{1+o(1)}$ min-ratio cycle problems $\min_{B^\top \Delta = 0} g^\top \Delta / \|L\Delta\|_1$ to $m^{o(1)}$ multiplicative accuracy. Because
our IPM ensures stability for lengths and gradients (see Lemmas 4.9 and 4.10), and is even robust to
approximations of lengths and gradients, we can show that over the course of the algorithm we only
need to update the entries of the gradients g and lengths $\ell$ at most $m^{1+o(1)}$ total times. Efficiency
gains based on leveraging stability have appeared in the earliest works on efficiently maintaining
IPM iterates [Kar84; Vai90] as well as in the most recent progress on speeding up linear programs.


Warm-Up: A Simple, Static Algorithm. A simple approach to finding an $\tilde{O}(1)$-approximate
min-ratio cycle is the following: given our graph G, we find a probabilistic low stretch spanning tree
T, i.e., a tree such that for each edge $e = (u,v) \in G$, the stretch of e, defined as

$$\mathrm{str}^{T,\ell}_e \stackrel{\mathrm{def}}{=} \frac{\sum_{e' \in T[u,v]} \ell(e')}{\ell(e)},$$

where T[u, v] is the unique path from u to v along the tree T, is $\tilde{O}(1)$ in expectation. Such a tree
can be found in $\tilde{O}(m)$ time [AKPW95; AN19].

Let $\Delta^*$ be the witness circulation that minimizes (1), and assume wlog that $\Delta^*$ is a cycle that
routes one unit of flow along the cycle. We assume for convenience that edges on $\Delta^*$ are oriented
along the flow direction of $\Delta^*$, i.e. that $\Delta^* \in \mathbb{R}^E_{\ge 0}$. Then, for each edge e = (u,v) on the cycle $\Delta^*$,
consider the fundamental tree cycle of e in T, denoted $e \oplus T[v, u]$, which is the cycle formed by edge e
concatenated with the path in T from its endpoint v to u. To work again with vector notation, we
denote by $p(e \oplus T[v, u]) \in \mathbb{R}^E$ the vector that sends one unit of flow along the cycle $e \oplus T[v, u]$ in
the direction that aligns with the orientation of e. A classic fact from graph theory now states that
$\Delta^* = \sum_{e : \Delta^*_e > 0} \Delta^*_e \cdot p(e \oplus T[v, u])$ (note that the tree-paths used by adjacent off-tree edges cancel
out, see Figure 1). In particular, this implies that $g^\top \Delta^* = \sum_{e : \Delta^*_e > 0} \Delta^*_e \cdot g^\top p(e \oplus T[v, u])$.
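
The decomposition of a circulation into fundamental tree cycles can be checked on a small example. The sketch below uses an arbitrary (not low stretch) spanning tree, and all function names and the five-edge graph are illustrative assumptions.

```python
def tree_path(n, edges, tree_ids, src, dst):
    """Signed traversals [(edge_id, sign)] of the unique tree path src -> dst.

    sign is +1 when the edge is traversed along its orientation, -1 otherwise.
    """
    adj = {v: [] for v in range(n)}
    for j in tree_ids:
        a, b = edges[j]
        adj[a].append((b, j, +1))
        adj[b].append((a, j, -1))
    prev = {src: None}  # how each vertex was first reached
    stack = [src]
    while stack:
        x = stack.pop()
        for y, j, sign in adj[x]:
            if y not in prev:
                prev[y] = (x, j, sign)
                stack.append(y)
    path = []
    x = dst
    while prev[x] is not None:
        x, j, sign = prev[x]
        path.append((j, sign))
    return path[::-1]


def tree_cycle_vector(n, edges, tree_ids, e):
    """p(e + T[v, u]): one unit along e = (u, v), then back along the tree from v to u."""
    u, v = edges[e]
    p = [0] * len(edges)
    p[e] += 1
    for j, sign in tree_path(n, edges, tree_ids, v, u):
        p[j] += sign
    return p


def decompose(n, edges, tree_ids, delta):
    """Sum of delta_e * p(e + T[v, u]) over edges with delta_e > 0."""
    total = [0] * len(edges)
    for e, w in enumerate(delta):
        if w > 0:
            p = tree_cycle_vector(n, edges, tree_ids, e)
            total = [t + w * x for t, x in zip(total, p)]
    return total


# Hypothetical example: 4-cycle 0 -> 1 -> 2 -> 3 -> 0 plus a chord (0, 2).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
tree_ids = [0, 4, 2]          # spanning tree {(0,1), (0,2), (2,3)}
delta = [1, 1, 1, 1, 0]       # one unit around the 4-cycle
print(decompose(4, edges, tree_ids, delta))
```

Tree edges on the cycle contribute the zero vector, and the two off-tree edges use the chord (0, 2) in opposite directions, so the chord cancels in the sum.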

This fact will allow us to argue that with probability at least $\frac{1}{2}$, one of the tree cycles is an
$\tilde{O}(1)$-approximate solution to (1). Since the stretch $\mathrm{str}^{T,\ell}_e$ of edges $e \in E$ is small in expectation,
we can, by Markov’s inequality, argue that with probability at least $\frac{1}{2}$ the circulation $\Delta^*$ is not
stretched by too much. Formally, we have that $\sum_{e : \Delta^*_e > 0} \Delta^*_e \cdot \|L\,p(e \oplus T[v, u])\|_1 \le \gamma \|L\Delta^*\|_1$ for
$\gamma = \tilde{O}(1)$. Combining our insights, we can thus derive that

$$\frac{1}{\gamma} \cdot \frac{g^\top \Delta^*}{\|L\Delta^*\|_1} \ge \frac{\sum_{e : \Delta^*_e > 0} \Delta^*_e \cdot g^\top p(e \oplus T[v, u])}{\sum_{e : \Delta^*_e > 0} \Delta^*_e \cdot \|L\,p(e \oplus T[v, u])\|_1} \ge \min_{e : \Delta^*_e > 0} \frac{g^\top p(e \oplus T[v, u])}{\|L\,p(e \oplus T[v, u])\|_1},$$

where the first inequality uses the stretch bound together with the fact that $g^\top \Delta^*$ is negative, and
the last inequality follows from the fact that $\min_{i \in [n]} \frac{x_i}{y_i} \le \frac{\sum_{i \in [n]} x_i}{\sum_{i \in [n]} y_i}$ for positive $y_i$.
But this exactly says that for the edge e minimizing the expression on the right, the
tree cycle $e \oplus T[v, u]$ is a $\gamma$-approximate solution to (1), as desired.

Figure 1: Illustrating the decomposition $\Delta^* = \sum_{e : \Delta^*_e > 0} \Delta^*_e \cdot p(e \oplus T[v, u])$ of a circulation into
tree cycles given by off-tree edges and the corresponding tree paths.

Since the low stretch spanning tree T stretches the circulation $\Delta^*$ reasonably with probability at
least $\frac{1}{2}$, we can boost the probability by sampling $\tilde{O}(1)$ trees $T_1, T_2, \dots, T_s$ independently at
random and conclude that w.h.p. one of the fundamental tree cycles gives an approximate solution
to (1).
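
The search over fundamental tree cycles in the warm-up can be sketched as follows. As a loudly stated assumption, this example uses a plain BFS spanning tree on a connected graph instead of a probabilistic low stretch spanning tree, so it illustrates only the cycle enumeration and carries no approximation guarantee. All names are hypothetical.

```python
from collections import deque


def bfs_tree(n, edges, root=0):
    """BFS spanning tree of a connected graph, ignoring edge directions.

    parent[y] = (x, edge_id, sign), where sign is +1 if climbing from y to its
    parent x follows the orientation of the edge, and -1 otherwise.
    """
    adj = [[] for _ in range(n)]
    for j, (a, b) in enumerate(edges):
        adj[a].append((b, j))
        adj[b].append((a, j))
    parent, depth = {root: None}, {root: 0}
    queue = deque([root])
    while queue:
        x = queue.popleft()
        for y, j in adj[x]:
            if y not in parent:
                parent[y] = (x, j, +1 if edges[j][0] == y else -1)
                depth[y] = depth[x] + 1
                queue.append(y)
    return parent, depth


def tree_cycle(edges, parent, depth, e):
    """Signed edge vector (as a dict) of the cycle e + T[v, u] for e = (u, v)."""
    u, v = edges[e]
    p = {e: 1}
    x, y = v, u  # x climbs from v toward the meeting point, y climbs from u
    while x != y:
        if depth[x] >= depth[y]:
            x, j, sign = parent[x]
            p[j] = p.get(j, 0) + sign   # climbing part of the path v -> u
        else:
            y, j, sign = parent[y]
            p[j] = p.get(j, 0) - sign   # descending part, so the sign is flipped
    return p


def best_tree_cycle(n, edges, g, lengths, root=0):
    """Return (ratio, cycle) for the fundamental tree cycle with the smallest ratio."""
    parent, depth = bfs_tree(n, edges, root)
    best = None
    for e in range(len(edges)):
        p = tree_cycle(edges, parent, depth, e)
        den = sum(lengths[j] * abs(x) for j, x in p.items())
        if den == 0:
            continue  # e is a tree edge, so its cycle is empty
        num = sum(g[j] * x for j, x in p.items())
        flip = -1 if num > 0 else 1  # traverse in the direction with negative ratio
        ratio = -abs(num) / den
        if best is None or ratio < best[0]:
            best = (ratio, {j: flip * x for j, x in p.items() if x != 0})
    return best


# Hypothetical example: 4-cycle plus a chord, unit lengths.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
ratio, cycle = best_tree_cycle(4, edges, g=[1, -4, 2, 0, 1], lengths=[1, 1, 1, 1, 1])
print(ratio, cycle)
```

Repeating this search over several independently sampled trees and keeping the best cycle corresponds to the boosting step described above.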

Unfortunately, after updating the flow f to $f'$ along such a fundamental tree cycle, we cannot
reuse the set of trees $T_1, T_2, \dots, T_s$ because the next solution to (1) has to be found with respect
to gradients $g(f')$ and lengths $\ell(f')$ depending on $f'$ (instead of $g = g(f)$ and $\ell = \ell(f)$). But
$g(f')$ and $\ell(f')$ depend on the randomness used in trees $T_1, T_2, \dots, T_s$. Thus, naively, we have to
recompute all trees, spending again $\Omega(m)$ time. But this leads to run-time $\Omega(m^2)$ for our overall
algorithm which is far from our goal.


A Dynamic Approach. Thus we consider the data structure problem of maintaining an
$m^{o(1)}$-approximate solution to (1) over a sequence of at most $m^{1+o(1)}$ changes to entries of $g, \ell$. To
achieve an almost linear time algorithm overall, we want our data structure to have an amortized
$m^{o(1)}$ update time. Motivated by the simple construction above, our data structure will ultimately
maintain a set of $s = m^{o(1)}$ spanning trees $T_1, \dots, T_s$ of the graph G. Each cycle $\Delta$ that is returned
is represented by $m^{o(1)}$ off-tree edges and paths connecting them on some $T_i$.

To obtain an efficient algorithm to maintain these trees $T_i$, we turn to a recursive approach. In
each level of our recursion, we first reduce the number of vertices, and then the number of edges in 
the graphs we recurse on. To reduce the number of vertices, we produce a core graph on a subset of 
the original vertex set, and we then compute a spanner of the core graph which reduces the number 
of edges. Both of these objects need to be maintained dynamically, and we ensure they are very 
stable under changes in the graphs at shallower levels in the recursion. In both cases, our notion 
of stability relies on some subtle properties of the interaction between the data structure and the 
hidden witness circulation. 

We maintain a recursive hierarchy of graphs. At the top level of our hierarchy, for the input
graph G, we produce B = O(log n) core graphs. To obtain each such core graph, for each $i \in [B]$, we
sample a (random) forest $F_i$ with $\tilde{O}(m/k)$ connected components for some size reduction parameter
k. The associated core graph is the graph $G/F_i$ which denotes G after contracting the vertices in
the same components of $F_i$. We can define a map that lifts circulations $\hat{\Delta}$ in the core graph $G/F_i$ to
circulations $\Delta$ in the graph G by routing flow along the contracted paths in $F_i$. The lengths $\hat{\ell}$ in the
core graph (again let $\hat{L} = \mathrm{diag}(\hat{\ell})$) are chosen to upper bound the length of circulations when
mapped back into G such that $\|\hat{L}\hat{\Delta}\|_1 \ge \|L\Delta\|_1$. Crucially, we must ensure these new lengths $\hat{\ell}$ do
not stretch the witness circulation $\Delta^*$ when mapped into $G/F_i$ by too much, so we can recover it
from $G/F_i$. To achieve this goal, we choose $F_i$ to be a low stretch forest, i.e. a forest with properties
similar to those of a low stretch tree. In Section 2.3, we summarize the central aspects of our core
graph construction.
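
The contraction step that produces a core graph can be sketched with a union-find structure. This is a minimal illustration under stated assumptions: the forest is given explicitly as a list of edge indices rather than sampled as a low stretch forest, the core-graph lengths are not computed, and the function name is hypothetical.

```python
def contract_forest(n, edges, forest_ids):
    """Contract each connected component of the forest to a single core vertex.

    Returns (component_of, core_edges): component_of[v] is the core vertex of v,
    and core_edges[j] is the image of edges[j]. The core graph keeps all m edges;
    edges inside one component become self-loops.
    """
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for j in forest_ids:
        a, b = edges[j]
        ra, rb = find(a), find(b)
        if ra != rb:
            root[ra] = rb

    label = {}
    component_of = []
    for v in range(n):
        r = find(v)
        component_of.append(label.setdefault(r, len(label)))
    core_edges = [(component_of[a], component_of[b]) for a, b in edges]
    return component_of, core_edges


# Hypothetical example: a 5-cycle with forest edges (0,1) and (2,3).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
component_of, core_edges = contract_forest(5, edges, forest_ids=[0, 2])
print(component_of, core_edges)
```

In this example the five original vertices collapse to three core vertices, while the edge count stays at five, which is why a separate edge reduction step is needed afterwards.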

While each core graph $G/F_i$ now has only $\tilde{O}(m/k)$ vertices, it still has m edges which is too
large for our recursion. To overcome this issue we build a spanner $S(G, F_i)$ on $G/F_i$ to reduce the
number of edges to $\tilde{O}(m/k)$, which guarantees that for every edge e = (u,v) that we remove from
$G/F_i$ to obtain $S(G, F_i)$, there is a u-to-v path in $S(G, F_i)$ of length $m^{o(1)}$. Ideally, we would now
recurse on each spanner $S(G, F_i)$, again approximating it with a collection of smaller core graphs
and spanners. However, we face an obstacle: removing edges could destroy the witness circulation,
so that possibly no good circulation exists in any $S(G, F_i)$. To solve this problem, we compute an
explicit embedding $\Pi_{G/F_i \to S(G,F_i)}$ that maps each edge $e = (u,v) \in G/F_i$ to a short u-to-v path
in $S(G, F_i)$. We can then show the following dichotomy: Let $\hat{\Delta}(f)^*$ denote the witness circulation
when mapped into the core graph $G/F_i$. Then, either one of the edges $e \in E_{G/F_i} \setminus E_{S(G,F_i)}$ has a
spanner cycle consisting of e combined with $\Pi_{G/F_i \to S(G,F_i)}(e)$ which is almost as good as $\hat{\Delta}(f)^*$, or
re-routing $\hat{\Delta}(f)^*$ into $S(G, F_i)$ roughly preserves its quality. Figure 2 illustrates this dichotomy.
Thus, either we find a good cycle using the spanner, or we can recursively find a solution on $S(G, F_i)$
that almost matches $\hat{\Delta}(f)^*$ in quality. To construct our dynamic spanner with its strong stability
guarantees under changes in the input graph, we use a new approach that diverges from other
recent works on dynamic spanners; we give an outline of the key ideas in Section 2.7.

Our recursion uses $d$ levels, where we choose the size reduction factor $k$ such that $k^d \approx m$ and
the bottom level graphs have $m^{o(1)}$ edges. Note that since we build $B$ trees on $G$ and recurse on the
spanners of $G/F_1, G/F_2, \dots, G/F_B$, our recursive hierarchy has a branching factor of $B = O(\log n)$
at each level of recursion. Thus, choosing $d \le \sqrt{\log n}$, we get $B^d = m^{o(1)}$ leaf nodes in our recursive
hierarchy. Now, consider the forests $F_{i_1}, F_{i_2}, \dots, F_{i_d}$ on the path from the top of our recursive
hierarchy to a leaf node. We can patch these forests together to form a tree associated with the leaf
node. We maintain each of these trees as a link-cut tree data structure. Using this data structure,
whenever we find a good cycle, we can route flow along it and detect edges where the flow has
changed significantly. The cycles are either given by an off-tree edge or a collection of $m^{o(1)}$ off-tree
edges coming from a spanner cycle. We call the entire construction a branching tree chain, and in
Section 2.4, we elaborate on the overall composition of the data structure.
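
To get a feel for the parameter choice above, the following minimal sketch computes the size reduction factor, the branching factor, and the number of leaves for concrete numbers. The function name `hierarchy_parameters` and the choice of taking B to be exactly ceil(log2 n) are assumptions made for this illustration; the text only fixes these quantities up to asymptotic notation.

```python
import math

def hierarchy_parameters(m: int, n: int, d: int):
    """Return (k, B, leaves) for a d-level hierarchy on a graph with m edges and n vertices.

    Illustrative assumptions: k is chosen so that k**d equals m, and the
    branching factor B = O(log n) is taken to be exactly ceil(log2 n).
    """
    k = m ** (1.0 / d)            # size reduction factor, so that k^d ~ m
    B = math.ceil(math.log2(n))   # branching factor at each level
    leaves = B ** d               # number of leaf nodes in the hierarchy, B^d
    return k, B, leaves

# Example with m = 2^20 edges, n = 2^16 vertices, and d = 4 levels.
k, B, leaves = hierarchy_parameters(m=2**20, n=2**16, d=4)
```

In this example each level shrinks the graph by a factor of about 32, while the hierarchy has 16^4 leaves; the asymptotic claim $B^d = m^{o(1)}$ only becomes visible for much larger inputs.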

What have we achieved using this hierarchical construction compared to our simple, static
algorithm? First, consider the setting of an oblivious adversary, where the gradient and length
update sequences and the optimal circulation after each update are fixed in advance. In this setting,
we can show that our spanner-of-core graph construction can survive through $m^{1-o(1)}/k^i$ updates at
level $i$. Meanwhile, we can rebuild these constructions in time $m^{1+o(1)}/k^{i-1}$, leading to an amortized
cost per update of $k\,m^{o(1)} \le m^{o(1)}$ at each level. This gives the first dynamic data structure for our
undirected min-ratio problem with $m^{o(1)}$ query time against an oblivious adversary.

Figure 2: Illustration of a dichotomy: either one of the edges $e \in E_{G/F_i} \setminus E_{S(G,F_i)}$ has a spanner
cycle consisting of $e$ combined with $\Pi_{G/F_i \to S(G,F_i)}(e)$ which is almost as good as $\Delta(f)^*$, or
re-routing $\Delta(f)^*$ into $S(G, F_i)$ roughly preserves its quality. (Panels: the core graph $G/F_i$ with its
spanner edges, the spanner embedding paths, and the resulting spanner cycles; the circulation $\Delta(f)^*$
embedded into the spanner $S(G, F_i)$.)

However, our real problem is harder: the witness circulation in each round is $\Delta(f)^* = f^* - f$
and depends on the updates we make to $f$, making our problem adaptive. Instead of modelling
our IPM as giving rise to a fully-dynamic problem against an adaptive adversary, the promise that
the witness circulation can always be written as $f^* - f$ lets us express the IPM with an adversary
that is much more restricted. Our data structure needs to ensure that the flow $f^* - f$ is stretched
by $m^{o(1)}$ on average w.r.t. the lengths $\ell$. At a high level, we achieve this by forcing the forests
at every level to have stretch 1 on edges where $f_e$ changes significantly and could affect the total
stretch of our data structure on $f^* - f$. Section 2.5 describes the guarantees we achieve using this
strategy. However, the data structure at this point is not yet guaranteed to succeed. Instead, we
very carefully characterize the failure condition. In particular, to induce a failure, the adversary
must create a situation where the current value of $\|L\Delta(f)^*\|_1$ is significantly less than the value
when the levels of our data structure were last rebuilt. This means we can counteract this
failure by rebuilding the data structure levels. Due to the high cost of rebuilding the shallowest
levels of the data structure, naively rebuilding the entire data structure is much too expensive, and
we need a more sophisticated strategy. We describe this strategy in Section 2.6, where we design a
game that expresses the conflict between our data structure and the adversary, and we show how
to win this game without paying too much runtime for rebuilds.


2.3 Building Core Graphs 


In this section, we describe our core graph construction (Definition 6.7), which maps our dynamic
undirected min-ratio cycle problem on a graph $G$ with at most $m$ edges and vertices into a problem
of the same type on a graph with only $O(m/k)$ vertices and $m$ edges, and handles $O(m/k)$ updates
to the edges before we need to rebuild it. Our construction is based on constructing low-stretch
decompositions using forests and portal routing (Lemma 6.5). We first describe how our portal
routing uses a given forest $F$ to construct a core graph $G/F$. We then discuss how to use a collection
of (random) forests $F_1, \dots, F_B$ to produce a low-stretch decomposition of $G$, which will ensure that
one of the core graphs $G/F_i$ preserves the witness circulation well. Portal routings played a key
role in the ultrasparsifiers of [ST04] and have been further developed in many works since.


Forest Routings and Stretches. To understand how to define the stretch of an edge $e$ with
respect to a forest $F$, it is useful to define how to route an edge $e$ in $F$. Given a spanning forest
$F$, every path and cycle in $G$ can be mapped to $G/F$ naturally (where we allow $G/F$ to contain
self-loops). On the other hand, if every connected component in $F$ is rooted, where $\mathrm{root}^F_u$ denotes
the root corresponding to a vertex $u \in V$, we can map every path and cycle in $G/F$ back to $G$
as follows. Let $P = e_1, \dots, e_k$ be any (not necessarily simple) path in $G/F$ where the preimage of
every edge $e_i$ is $e_i^G = (u_i^G, v_i^G) \in G$. The preimage of $P$, denoted $P^G$, is defined as the following
concatenation of paths:

$$P^G = \bigoplus_{i=1}^{k} F[\mathrm{root}^F_{u_i^G}, u_i^G] \oplus e_i^G \oplus F[v_i^G, \mathrm{root}^F_{v_i^G}],$$

where we use $A \oplus B$ to denote the concatenation of paths $A$ and $B$, and $F[a, b]$ to denote the unique
$ab$-path in the forest $F$. When $P$ is a circuit (i.e. a not necessarily simple cycle), $P^G$ is a circuit in
$G$ as well. One can extend these maps linearly to all flow vectors and denote the resulting operators
as $\Pi_F : \mathbb{R}^{E(G)} \to \mathbb{R}^{E(G/F)}$ and $\Pi_F^{-1} : \mathbb{R}^{E(G/F)} \to \mathbb{R}^{E(G)}$. Since we let $G/F$ have self-loops, there is
a bijection between edges of $G$ and $G/F$ and thus $\Pi_F$ acts like the identity function.
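
The preimage map can be made concrete with a small sketch. The representation below is an assumption made for illustration: the rooted forest is a dictionary `parent` (roots map to `None`), and a path in $G/F$ is given as the list of preimages $(u, v)$ of its edges, oriented along the path. The helper names `path_to_root` and `lift_path` are hypothetical.

```python
def path_to_root(parent, u):
    """Vertices on the forest path from u up to its root (parent[root] is None)."""
    path = [u]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

def lift_path(parent, core_path):
    """Lift a path in G/F to a vertex walk in G.

    core_path lists the preimages (u, v) in G of the edges of the path.
    Each edge contributes F[root_u, u] + (u, v) + F[v, root_v].
    """
    walk = []
    for (u, v) in core_path:
        down = path_to_root(parent, u)[::-1]   # root_u -> ... -> u
        up = path_to_root(parent, v)           # v -> ... -> root_v
        segment = down + up                    # the edge (u, v) joins the two pieces
        # Consecutive segments meet at a shared root; avoid repeating it.
        if walk and walk[-1] == segment[0]:
            segment = segment[1:]
        walk.extend(segment)
    return walk

# Two components: {r1, a, b} rooted at r1 and {r2, c} rooted at r2.
parent = {"r1": None, "a": "r1", "b": "a", "r2": None, "c": "r2"}
# A 2-edge circuit in G/F whose edges have preimages (b, c) and (c, a) in G.
lifted = lift_path(parent, [("b", "c"), ("c", "a")])
```

Here the lifted walk starts and ends at the root `r1`, matching the statement that the preimage of a circuit is again a circuit in $G$.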

To make our core graph construction dynamic, the key operation we need to support is the
dynamic addition of more root nodes, which results in forest edges being deleted to maintain the
invariant that each connected component has a root node. Whenever an edge is changing in $G$, we
ensure that $G/F$ approximates the changed edge well by forcing both its endpoints to become root
nodes, which in turn makes the portal routing of the new edge trivial and guarantees that its stretch
is 1. An example of this is shown in Figure 3.

For any edge $e^G = (u^G, v^G)$ in $G$ with image $e$ in $G/F$, we set $\widehat{\ell}^F_e$, the edge length of $e$ in $G/F$,
to be an upper bound on the length of the forest routing of $e$, i.e. the path $F[\mathrm{root}^F_{u^G}, u^G] \oplus e^G \oplus
F[v^G, \mathrm{root}^F_{v^G}]$. Meanwhile, we define $\widetilde{\mathrm{str}}_e = \widehat{\ell}^F_e / \ell_e$ as an overestimate on the stretch of $e$ w.r.t. the
forest routing. A priori, it is unclear how to provide a single upper bound on the stretch of every
edge, as the root nodes of the endpoints are changing over time. Providing such a bound for every
edge is important for us as the lengths in $G/F$ could otherwise be changing too often when the
forest changes. We guarantee these bounds by a scheme that makes auxiliary edge deletions in the
forest in response to external updates, with these additional roots chosen carefully to ensure the
length upper bounds.
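
As a hedged illustration of the stretch overestimate, the sketch below takes the length upper bound to be exactly the length of the forest routing (the text only requires an upper bound). The forest is again a `parent` dictionary, `forest_length` maps a (child, parent) pair to the length of that forest edge, and both helper names are hypothetical.

```python
def dist_to_root(parent, forest_length, u):
    """Total length of the forest edges on the path from u up to its root."""
    total = 0.0
    while parent[u] is not None:
        total += forest_length[(u, parent[u])]
        u = parent[u]
    return total

def stretch_overestimate(parent, forest_length, u, v, edge_length):
    """Length of F[root_u, u] + (u, v) + F[v, root_v], divided by the length of (u, v)."""
    routed = (dist_to_root(parent, forest_length, u)
              + edge_length
              + dist_to_root(parent, forest_length, v))
    return routed / edge_length

parent = {"r1": None, "a": "r1", "b": "a", "r2": None, "c": "r2"}
forest_length = {("a", "r1"): 2.0, ("b", "a"): 1.0, ("c", "r2"): 3.0}

# Edge (b, c) of length 2: routing length is 3 + 2 + 3 = 8, so the stretch is 4.
s_bc = stretch_overestimate(parent, forest_length, "b", "c", 2.0)

# If both endpoints are roots, the routing is the edge itself and the stretch is 1.
s_roots = stretch_overestimate({"x": None, "y": None}, {}, "x", "y", 5.0)
```

The second call mirrors the mechanism described above: turning both endpoints of a changed edge into roots makes its routing trivial.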

Now, for any flow $f$ in $G/F$, its length in $G/F$ is at least the length of its pre-image in $G$, i.e.
$\|L\,\Pi_F^{-1} f\|_1 \le \|\widehat{L}^F f\|_1$. Let $\Delta^*$ be the optimal solution to (1). We will show later how to build $F$
such that $\|\widehat{L}^F \Pi_F \Delta^*\|_1 \le \gamma \|L\Delta^*\|_1$ holds for some $\gamma = m^{o(1)}$. Then solving (1) on $G/F$ with edge lengths
$\widehat{\ell}^F$ and a properly defined gradient $\widehat{g}$ on $G/F$ yields a $\gamma$-approximate solution for $G$. The gradient
$\widehat{g}$ is defined so that the total gradient of any circulation $\Delta$ on $G/F$ and its preimage $\Pi_F^{-1}\Delta$ in $G$
is the same, i.e. $\widehat{g}^\top \Delta = g^\top \Pi_F^{-1}\Delta$. The idea of incorporating gradients into portal routing was
introduced in [KPSW19]; our version of this construction is somewhat different to allow us to make
it dynamic efficiently.
Figure 3: Illustration of the core graph $G/F$ changing as an edge is deleted in $G$ (and in $F$).

Collections of Low Stretch Decompositions (LSD). The first component of the data structure
is constructing and maintaining forests $F$ that form a Low Stretch Decomposition (LSD) of
$G$. Variations of LSDs (such as j-trees) have been used to construct several recursive graph
preconditioners [Mad10; She13; KLOS14; CPW21] and dynamic algorithms [CGHPS20]. Informally,
a $k$-LSD is a rooted forest $F \subseteq G$ that decomposes $G$ into $O(m/k)$ vertex disjoint components.
Given some positive edge weights $v \in \mathbb{R}^E_{>0}$ and reduction factor $k > 0$, we compute a $k$-LSD $F$ and
length upper bounds $\widehat{\ell}^F$ of $G/F$ that satisfy two properties:


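1. $\widetilde{\mathrm{str}}^F_e = \widehat{\ell}^F_e / \ell_e = O(k)$ for any edge $e^G \in G$ with image $e$ in $G/F$, and
2. The weighted average of $\widetilde{\mathrm{str}}^F_e$ w.r.t. $v$ is only $O(1)$, i.e. $\sum_{e^G \in G} v_e \cdot \widetilde{\mathrm{str}}^F_e \le O(1) \cdot \|v\|_1$.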


Item 1 guarantees that the solution to (1) for $G/F$ yields an $O(k)$-approximate one for $G$. However,
this guarantee is not sufficient for our data structure, as our $B$-branching tree chain has $d \approx \log_k m$
levels of recursion and the quality of the solution from the deepest level would only be
$O(k)^d \approx m^{1+o(1)}$-approximate.

Instead, like [Mad10; She13; KLOS14] we compute $k$ different edge weights $v_1, \dots, v_k$ via
multiplicative weight updates (Lemma 6.6) so that the corresponding LSDs $F_1, \dots, F_k$ have $O(1)$ average
stretch on every edge in $G$: $\sum_{j=1}^{k} \widetilde{\mathrm{str}}^{F_j}_e = O(k)$ for all $e^G \in G$ with image $e$ in $G/F_j$.

By Markov's inequality, for any fixed flow $f$ in $G$, $\|\widetilde{\mathrm{str}}^{F_j} \circ Lf\|_1 \le O(1)\,\|Lf\|_1$ holds for at least
half the LSDs corresponding to $F_1, \dots, F_k$. Taking $O(\log n)$ samples uniformly from $F_1, \dots, F_k$, say
$F_1, \dots, F_B$ for $B = O(\log n)$, we get that with high probability

$$\min_{j \in [B]} \|\widetilde{\mathrm{str}}^{F_j} \circ L\Delta^*\|_1 \le O(1)\,\|L\Delta^*\|_1. \qquad (4)$$

That is, it suffices to solve (1) on $G/F_1, \dots, G/F_B$ to find an $O(1)$-approximate solution for $G$.
We provide all details including definitions and construction of the core graph in Section 6.


2.4 Maintaining a Branching Tree Chain 


The goal of this section is to elaborate on how we combine core graphs and spanners to produce our 
overall data structure for our undirected min-ratio cycle problem, the B-branching tree chain. We 
also describe how the data structure is maintained under dynamic updates, which is more formally 
shown in Section 7. A central reason our hierarchical data structure works is that the components, 
both core graphs and spanners, are designed to remain very stable under dynamic changes to the 
input graphs they approximate. In the literature on dynamic graph algorithms, this is referred to 
as having low recourse. 


1. Sample and maintain $B = O(\log n)$ $k$-LSDs $F_1, F_2, \dots, F_B$, and their associated core graphs
$G/F_i$. Over the course of $O(m/k)$ updates at the top level, the forests $F_i$ are decremental,
i.e. only undergo edge deletions (from root insertions), and will have $O(m/k)$ connected
components.


2. Maintain spanners $S(G, F_i)$ of the core graphs $G/F_i$, and embeddings $\Pi_{G/F_i \to S(G,F_i)}$, say
with length increase $\gamma_\ell = m^{o(1)}$.


3. Recursively process the graphs $S(G, F_i)$, i.e. maintain LSDs and core graphs on those, and
spanners on the contracted graphs, etc. Go for $d$ total levels, for $k^d = m$.

4. Whenever a level $i$ accumulates $m/k^i$ total updates, hence doubling the number of edges in
the graphs at that level, we rebuild levels $i, i+1, \dots, d$.


Recall that on average, the LSDs stretch lengths by $O(1)$, and the spanners $S(G, F_i)$ stretch lengths
by $\gamma_\ell$. Hence the overall data structure stretches lengths by $O(\gamma_\ell)^d = m^{o(1)}$ (for appropriately
chosen $d$).

We now discuss details on how to update the core graphs $G/F_i$ and spanners $S(G, F_i)$. Intuitively,
every time an edge $e = (u,v)$ is changed in $G$, we will delete $O(1)$ additional edges from $F_i$. This
ensures that no edge's total stretch/routing-length increases significantly due to the deletion of $e$
(Lemma 6.5). As the forest $F_i$ undergoes edge deletions, the graph $G/F_i$ undergoes vertex splits,
where a vertex has a subset of its edges moved to a newly inserted vertex. Thus, a key component
of our data structure is to maintain spanners and embeddings of graphs undergoing vertex splits (as
well as edge insertions/deletions). It is important that the amortized recourse (number of changes)
to the spanner $S(G, F_i)$ is $m^{o(1)}$ independent of $k$, even though the average degree of $G/F_i$ is $\Omega(k)$,
and hence on average $\Omega(k)$ edges will move per vertex split in $G/F_i$. We discuss the more precise
guarantees in Section 2.7.

Overall, let every level have recourse $\gamma_r = m^{o(1)}$ (independent of $k$) per tree. Then each
update at the top level induces $O(B\gamma_r)^d$ (as each tree branches into $B$ trees) updates in the data
structure overall. Intuitively, for the proper choice of $d = \omega(1)$, both the total recourse $O(B\gamma_r)^d$
and approximation factor $O(\gamma_\ell)^d$ are $m^{o(1)}$ as desired.


2.5 Going Beyond Oblivious Adversaries by using IPM Guarantees 


The precise data structure in the previous section only works for oblivious adversaries, because we
used that if we sampled $B = O(\log n)$ LSDs, then w.h.p. there is a tree whose average stretch is
$O(1)$ with respect to a fixed flow $f$. However, since we are updating the flow along the circulations
returned by our data structure, we influence future updates, so the optimal circulations our data
structure needs to preserve are not independent of the randomness used to generate the LSDs. To
overcome this issue we leverage the key fact that the flow $f^* - f$ is a good witness for the min-ratio
cycle problem at each iteration.

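Lemma 4.7 states that for any flow $f$, $g(f)^\top \Delta(f) / (100m + \|L(f)\Delta(f)\|_1) \le -\tilde{\Omega}(1)$ holds,
where $\Delta(f) = f^* - f$. Then, the best solution to (1) among the LSDs $G/F_1, \dots, G/F_B$ maintains
an $\tilde{O}(1)$-approximation of the quality of the witness $\Delta(f) = f^* - f$ as long as

$$\min_{j \in [B]} \|\widehat{L}^{F_j} \Pi_{F_j} \Delta(f)\|_1 \le \tilde{O}(1)\,\|L(f)\Delta(f)\|_1 + \tilde{O}(m). \qquad (5)$$

In this case, let $\widehat{\Delta}$ be the best solution obtained from $G/F_1, \dots, G/F_B$, with core-graph gradient $\widehat{g}$
and lengths $\widehat{L}$. We have

$$\frac{\widehat{g}^\top \widehat{\Delta}}{\|\widehat{L}\widehat{\Delta}\|_1} \le \frac{g(f)^\top \Delta(f)}{\tilde{O}(1)\,\|L(f)\Delta(f)\|_1 + \tilde{O}(m)} \le -\tilde{\Omega}(1).$$

The additive $\tilde{O}(m)$ term is there for a technical reason discussed later.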

To formalize this intuition, we define the width $w(f)$ of $\Delta(f)$ as $w(f) = 100 \cdot \mathbf{1} + |L(f)\Delta(f)|$.
The name comes from the fact that $w(f)_e$ is always at least $|\ell(f)_e (f^*_e - f_e)|$ for any edge $e$. We
show that the width is also slowly changing (Lemma 9.2) across IPM iterations, in that if the width
of an edge $e$ changed by a lot, then the residual capacity of $e$ must have changed significantly. This gives our
data structure a way to predict which edges' contribution to the length of the witness flow $f^* - f$
could have significantly increased.
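
The width definition is easy to evaluate on explicit vectors. The sketch below is purely illustrative: in the algorithm the optimal flow $f^*$ is hidden, so the width is never computed directly. The function name `width` and the list-based representation are assumptions.

```python
def width(lengths, f_star, f):
    """Entrywise width w(f) = 100 * 1 + |L(f) (f* - f)| for per-edge lists."""
    return [100.0 + abs(l * (fs - fe)) for l, fs, fe in zip(lengths, f_star, f)]

# Two edges with lengths 2 and 0.5; the witness f* - f is (2, -3).
w = width([2.0, 0.5], [3.0, 1.0], [1.0, 4.0])
total_width = sum(w)   # the 1-norm of w, since all entries are positive
```

Each entry is at least 100, so the total width is always at least 100 times the number of edges, which is where the additive term in (5) comes from.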
Observe that for any forest $F_j$ in the LSD of $G$, we have $\|\widehat{L}^{F_j} \Pi_{F_j} \Delta(f)\|_1 \le \|\widetilde{\mathrm{str}}^{F_j} \circ w(f)\|_1$. Thus,
we can strengthen (5) and show that the IPM potential can be decreased by $m^{-o(1)}$ if

$$\min_{j \in [B]} \|\widetilde{\mathrm{str}}^{F_j} \circ w(f)\|_1 \le \tilde{O}(1)\,\|w(f)\|_1. \qquad (6)$$

(6) also holds w.h.p. if the collection of LSDs is built after knowing $f$. However, this does
not necessarily hold after augmenting with $\Delta$, an approximate solution to (1).

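Due to stability of $w(f)$, we have $w(f + \Delta)_e \approx w(f)_e$ for every edge $e$ whose length does
not change a lot. For other edges, we update their edge length and force the stretch to be 1,
i.e. $\widetilde{\mathrm{str}}^{F_j}_e = 1$ via the dynamic LSD maintenance, by shortcutting the routing of the edge $e$ at its
endpoints. This gives that for any $j \in [B]$, the following holds:

$$\|\widetilde{\mathrm{str}}^{F_j} \circ w(f + \Delta)\|_1 \lesssim \|\widetilde{\mathrm{str}}^{F_j} \circ w(f)\|_1 + \|w(f + \Delta)\|_1.$$

Using the fact that $\min_{j \in [B]} \|\widetilde{\mathrm{str}}^{F_j} \circ w(f)\|_1 \le \tilde{O}(1)\,\|w(f)\|_1$, we have the following:

$$\min_{j \in [B]} \|\widetilde{\mathrm{str}}^{F_j} \circ w(f + \Delta)\|_1 \lesssim \tilde{O}(1)\,\|w(f)\|_1 + \|w(f + \Delta)\|_1.$$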

Thus, solving (1) on the updated $G/F_1, \dots, G/F_B$ yields a good enough solution for reducing
IPM potential as long as the total width has not decreased significantly, i.e.
$\|w(f)\|_1 \le \tilde{O}(1)\,\|w(f + \Delta)\|_1$, so that the bound above is at most $\tilde{O}(1)\,\|w(f + \Delta)\|_1$.

If the solution on the updated graphs $G/F_1, \dots, G/F_B$ does not have a good enough quality,
we know by the above discussion that $\|w(f)\|_1 > 100\,\|w(f + \Delta)\|_1$ must hold. Then, we
re-compute the collection of LSDs of $G$ and solve (1) on the new collection of $G/F_1, \dots, G/F_B$ again.
Because each recomputation reduces the $\ell_1$ norm of the width by a constant factor, and all the
widths are bounded by $\exp(\log^{O(1)} m)$ (as discussed in Section 2.1), there can be at most $\tilde{O}(1)$ such
recomputations. At the top level, this only increases our runtime by $\tilde{O}(1)$ factors.

The real situation is much more complicated since we recursively maintain the solutions on the
spanners of each $G/F_1, \dots, G/F_B$. Hence, it is possible that lower levels in the data structure are
the “reason” that the quality of the solution is poor. More formally, let $T$ be the total number
of IPM iterations. We use $t \in [T]$ to index each iteration and use $x^{(t)}$ to denote the
state of any variable $x$ after the $t$-th iteration. For example, $f^{(t)}$ is the flow computed so far after $t$
IPM iterations and we define $w^{(t)} = w(f^{(t)})$ to be the width w.r.t. $f^{(t)}$. Recall that every graph
maintained in the dynamic $B$-Branching Tree Chain re-computes its collection of LSDs after a certain
amount of updates. When some graph at level $i$ re-computes, we enforce every graph at the same
level to re-compute as well. Since there are only $m^{o(1)}$ such graphs at each level, this scheme results
in an $m^{o(1)}$ overhead on the update time, which is tolerable. For every level $i = 0, \dots, d$, we define
$\mathrm{prev}_i^{(t)}$ to be the most recent iteration at or before $t$ at which a re-computation of LSDs occurs at level $i$.
For graphs at level $d$, which contain only $m^{o(1)}$ vertices, we enforce a rebuild every time and always
have $\mathrm{prev}_d^{(t)} = t$. We show in Lemma 7.9 that the cycle output by the data structure in the $t$-th
IPM iteration has length at most

$$m^{o(1)} \sum_{i=0}^{d} \|w^{(\mathrm{prev}_i^{(t)})}\|_1.$$

This inequality is a natural generalization of the $\tilde{O}(1)\,(\|w(f)\|_1 + \|w(f + \Delta)\|_1)$-bound when
taking recursive structure into account.
At this point, we want to emphasize that the fact that we can prove this guarantee depends on
certain “monotonicity” properties of both our core and spanner graph constructions. In the core 
graph construction, it is essential that we can provide a fixed length upper bound for most edges. 
In the spanner construction, we crucially use that the set of edges routing into any fixed edge in the 
spanner is decremental for most spanner edges. This allows us to produce an initial upper bound 
on the width for edges in the spanner and continue using this bound as long as the spanner edge 
routes a decremental set. 

The cycle output by the data structure yields enough decrease in the IPM potential if its
1-norm is small enough. Otherwise, the 1-norm of the output cycle is large and we know that
$\sum_{i=0}^{d} \|w^{(\mathrm{prev}_i^{(t)})}\|_1$ is much more than $m^{o(1)}\,\|w^{(t)}\|_1$. In this way, the data structure can fail because
some lower level $i$ has $\|w^{(\mathrm{prev}_i^{(t)})}\|_1 \gg \|w^{(t)}\|_1$. A possible fix is to rebuild the entire data structure,
which sets $\mathrm{prev}_i^{(t)} = t$ at every level $i$. However, this costs linear time per rebuild, and this may need
to happen almost every iteration because there are multiple levels. In the next section we show
how to leverage that lower levels have cheaper rebuilding times (levels $i, i+1, \dots, d$ can be rebuilt
in time approximately $m^{1+o(1)}/k^i$) to design a more efficient rebuilding schedule.


2.6 The Rebuilding Game 


Our goal in Section 8 is to develop a strategy that finds approximate min-ratio cycles without 
spending too much time rebuilding our data structure when it fails to do so. In the previous 
overview section, we carefully characterized the conditions under which our data structure can fail 
against adversarial updates, given the promise that f* — f remains a good witness circulation. In 
this section, we set up a game which abstracts the properties of the data structure and the adversary. 
The player in this game wants to ensure our data structure works correctly by rebuilding levels of 
it when it fails. We show that the player can win without spending too much time on rebuilding. 
Recall $w^{(t)} = w(f^{(t)})$ is a hidden vector that we use to upper bound the $\ell_1$ cost of the hidden
witness circulation $\Delta(f^{(t)})$. We will refer to $\|w^{(t)}\|_1$ as the total width at time $t$. We argued in the
previous Section 2.5 that our branching-tree data structure can find a good cycle whenever the
total width $\|w^{(t)}\|_1$ is not too small compared to the total widths at the times when the levels
$0, 1, \dots, d$ of the data structure were last initialized or rebuilt. We let $\mathrm{prev}_i^{(t)}$ denote the stage when
level $i$ was last rebuilt, and refer to $\|w^{(\mathrm{prev}_i^{(t)})}\|_1$ as the total width at level $i$. As we saw in the
previous section, the only way our cycle-finding data structure can fail to produce a good enough
cycle is if $\sum_{i=0}^{d} \|w^{(\mathrm{prev}_i^{(t)})}\|_1 > m^{o(1)}\,\|w^{(t)}\|_1$. We can estimate the quality of the cycles we find, and
if we fail to find a good cycle we can conclude this undesired condition holds. However, even if the
condition holds, we might still find a good cycle “by accident”, so finding a cycle does not prove
that the data structure currently estimates the total width well. Because the total widths $\|w^{(t)}\|_1$
are hidden from us, we do not know which level(s) cause the problem when we fail to find a cycle.

We turn this into a game that abstracts the data structure and IPM and supposes that the total
width $\|w^{(t)}\|_1$ is an arbitrary positive number chosen by an adversary, while a player (our
protagonist) manages the data structure by rebuilding levels of the data structure to set $\mathrm{prev}_i^{(t)} = t$
when necessary. Now, because of well-behaved numerical properties of our IPM, we are guaranteed
that $\log(\|w^{(t)}\|_1) \in [-\mathrm{poly}\log(m), \mathrm{poly}\log(m)]$, and we impose this condition on the total width in
our game as well. By developing a strategy that works against any adversary choosing such total
widths, we ensure our data structure will work with our IPM as a special case. In Definition 8.1
we formally define our rebuilding game.

In our branching tree data structure, level $i$ can be rebuilt at a cost of $m^{1+o(1)}/k^i$ and it can
last through roughly $m^{1-o(1)}/k^i$ cycle updates before we have to rebuild it because the core graph
has grown too large (we call this a “winning rebuild”). But, if we are unable to find a good cycle,
we are forced to rebuild sooner (we call this a “losing rebuild”). Which level should we rebuild if
we are unable to find a good cycle? The answer is not immediately clear, because any level could
have too large total width. However, by tuning our parameters such that the $m^{o(1)}$ factor in our
condition $\sum_{i=0}^{d} \|w^{(\mathrm{prev}_i^{(t)})}\|_1 > m^{o(1)}\,\|w^{(t)}\|_1$ is larger than $2(d+1)$, we can deduce that if a failure
occurs, then $\max_{0 \le i \le d} \|w^{(\mathrm{prev}_i^{(t)})}\|_1 > 2\,\|w^{(t)}\|_1$. Thus, if the total width at level $i$ is too large, then a
losing rebuild at level $i$ (and hence updating $w^{(\mathrm{prev}_i^{(t)})}$ to $w^{(t)}$) will reduce its total width by at
least a factor 2.
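
The deduction above is an averaging argument, which can be written out as follows. The shorthand $x_i$, $y$, and $C$ is introduced here only for this derivation and is not notation from the paper.

```latex
% Shorthand introduced for this derivation:
%   x_i = \|w^{(\mathrm{prev}_i^{(t)})}\|_1, \quad y = \|w^{(t)}\|_1, \quad C = m^{o(1)} \ge 2(d+1).
\[
  (d+1)\,\max_{0 \le i \le d} x_i \;\ge\; \sum_{i=0}^{d} x_i \;>\; C\,y \;\ge\; 2(d+1)\,y
  \quad\Longrightarrow\quad
  \max_{0 \le i \le d} x_i \;>\; 2y .
\]
```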

This means that for any level $i$, if we do a losing rebuild of level $i$ $\mathrm{poly}\log(m)$ times before a
winning rebuild of level $i$, we can conclude that the too-large total width is not at level $i$. This
leads to the following strategy: Starting at the lowest level, do a losing rebuild of each level $i$ up
to $\mathrm{poly}\log(m)$ times after each winning rebuild, and then move to rebuilding level $i - 1$ in case
of more failures. We state this strategy more formally in Algorithm 6. This leads to a cost of
$O(m^{o(1)}(m + T))$ to process $T$ cycle updates in the rebuilding game, as we prove in Lemma 8.3.
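
The bookkeeping behind this strategy can be sketched as follows. This is a simplified reading of the prose above, not a transcription of Algorithm 6: the class name `RebuildStrategy`, the parameter `limit` (standing in for the $\mathrm{poly}\log(m)$ budget), and the choice to reset the counters of deeper levels whenever a shallower level is rebuilt are all assumptions made for illustration.

```python
class RebuildStrategy:
    """Tracks losing rebuilds per level for a hierarchy with levels 0..d."""

    def __init__(self, d: int, limit: int):
        self.d = d
        self.limit = limit                 # losing-rebuild budget per level
        self.losing = [0] * (d + 1)        # losing rebuilds since the last reset

    def winning_rebuild(self, i: int):
        """Level i used up its update budget: rebuild levels i..d, reset their counters."""
        for j in range(i, self.d + 1):
            self.losing[j] = 0
        return list(range(i, self.d + 1))

    def on_failure(self) -> int:
        """No good cycle was found: return the level at which to do a losing rebuild.

        Start at the deepest level and move to shallower levels only once the
        deeper ones have exhausted their budget.
        """
        for i in range(self.d, -1, -1):
            if self.losing[i] < self.limit:
                self.losing[i] += 1
                # Rebuilding level i also rebuilds all deeper levels (assumption:
                # this gives them a fresh budget).
                for j in range(i + 1, self.d + 1):
                    self.losing[j] = 0
                return i
        # The analysis in the text rules this case out; the sketch just signals it.
        raise RuntimeError("all levels exhausted their losing-rebuild budget")

strategy = RebuildStrategy(d=1, limit=2)
levels = [strategy.on_failure() for _ in range(6)]
```

With two levels and a budget of two, the deeper level is rebuilt twice before each rebuild of the top level, so cheap rebuilds are always tried first.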

Finally, at the end of Section 8, we combine the data structure designed in the previous sections
with our strategy for the rebuilding game to create a data structure that successfully finds
update cycles in our hidden stable-flow chasing setting in amortized $m^{o(1)}$ cost per cycle update,
which is encapsulated in Theorem 6.2.


2.7 Dynamic Embeddings into Spanners of Decremental Graphs 


It remains to describe the algorithm to maintain a spanner S(G, F;) on the graphs G/F;. Let us 
recall the requirements on the spanner given in Section 2.4: 


1. Sparsity: at all times the spanner should be sparse, i.e. consist of at most $O(|V(S(G, F_i))|)$
edges. This is crucial for reducing the problem size: as we ensure that $F_i$ has only $O(m/k)$
connected components, $S(G, F_i)$ consists of $O(m/k)$ edges, reducing the problem
size by a factor of almost $k$.


2. Low Recourse: we further require that for each update to $G/F_i$, there are at most $\gamma_r = m^{o(1)}$
changes to $S(G, F_i)$ on average. This is crucial as otherwise the updates to $S(G, F_i)$ could
trigger even more updates in the $B$-Branching Tree Chain (see Section 2.4).


3. Short Paths with Embedding: we maintain the spanner such that for every edge $e$ in $G/F_i$,
its endpoints in $S(G, F_i)$ are at distance at most $\gamma_\ell \cdot \ell(e)$, and we even maintain witness paths
$\Pi_{G/F_i \to S(G,F_i)}(e)$ between the endpoints consisting of $\gamma_\ell$ edges. This is crucial as we need an
explicit way to check whether $e \oplus \Pi_{G/F_i \to S(G,F_i)}(e)$ is a good solution to the min-ratio cycle
problem.


4. Small Set of New Edges That We Embed Into: we ensure that after each update, we
return a set $D$ consisting of $m^{o(1)}$ edges such that each edge $e$ in $G/F_i$ is embedded into
a path $\Pi_{G/F_i \to S(G,F_i)}(e)$ consisting of edges on the old embedding path of $e$
and edges in $D$.


5. Efficient Update Time: we show how to maintain $S(G, F_i)$ with amortized update time $k\,m^{o(1)}$.

We note that additionally, we need our spanner to work against adaptive adversaries since
the update sequence is influenced by the output spanner. Although spanners have been studied
extensively in the dynamic setting, there is currently only a single result that works against adaptive
adversaries. While this spanner given in [BBGNSSS20] appears promising, it does not ensure our
desired low recourse property for vertex splits and this seems inherent to the algorithm (additionally,
it also does not maintain an embedding $\Pi_{G/F_i \to S(G,F_i)}$).

While we use similar elements as in [BBGNSSS20] to obtain spanners statically, we arrive at a
drastically different algorithm that can deal well with vertex splits. We focus first on obtaining an
algorithm with low recourse and discuss afterwards how to implement it efficiently.


A Static Algorithm. We first consider the static version of the problem on a graph $G/F_i$,
i.e. to give a static algorithm that computes a spanner with short path embeddings. By using
a simple bucketing scheme over edge lengths, we can assume w.l.o.g. that all edges have unit
length. We partition the graph into edge-disjoint expander graphs $H_1, H_2, \dots$, where each
$H_i$ has roughly uniform degree, i.e. its average degree is at most a polylogarithmic factor larger
than its minimum degree $\Delta_{\min}(H_i)$, and each vertex $v$ in $G$ is in at most $O(1)$ graphs $H_i$. Here,
we define an expander to be a graph $H_i$ that has no cut $(X, \bar{X})$, where $\bar{X} = V(H_i) \setminus X$, with

$$|E_{H_i}(X, \bar{X})| < \Omega\!\left(\frac{1}{\mathrm{poly}\log(m)}\right) \min\{\mathrm{vol}_{H_i}(X), \mathrm{vol}_{H_i}(\bar{X})\},$$

where $E_{H_i}(X, \bar{X})$ is the set of edges in $H_i$ with one endpoint in $X$ and the other in $\bar{X}$,
and $\mathrm{vol}_{H_i}(Y)$ is the sum of degrees over the vertices $y \in Y$.
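
The expansion condition can be checked for a single given cut with a few lines of code. This minimal sketch only tests one cut against a threshold `phi`; it does not certify that a graph is an expander, which would require considering all cuts. The function name `is_sparse_cut` and the edge-list representation are assumptions.

```python
def is_sparse_cut(edges, X, phi):
    """Return True if the cut (X, complement of X) violates expansion at threshold phi,
    i.e. if |E(X, X-bar)| < phi * min(vol(X), vol(X-bar))."""
    X = set(X)
    degree = {}
    crossing = 0
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if (u in X) != (v in X):       # exactly one endpoint lies in X
            crossing += 1
    vol_X = sum(d for x, d in degree.items() if x in X)
    vol_rest = sum(d for x, d in degree.items() if x not in X)
    return crossing < phi * min(vol_X, vol_rest)

# Two triangles joined by a single edge: the middle cut has 1 crossing edge
# and volume 7 on each side.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
sparse_at_02 = is_sparse_cut(edges, {0, 1, 2}, phi=0.2)
sparse_at_01 = is_sparse_cut(edges, {0, 1, 2}, phi=0.1)
```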

Next, consider any such expander $H_i$. It is well-known that sampling edges in expanders with
probability $p_i \approx \mathrm{poly}\log(m) / \Delta_{\min}(H_i)$ gives a cut-sparsifier $S_i$ of $H_i$, i.e. a graph such that for each cut $(X, \bar{X})$,
we have $|E_{H_i}(X, \bar{X})| \approx |E_{S_i}(X, \bar{X})| / p_i$ (see [ST04; BBGNSSS20]). This ensures that $S_i$ is also
an expander. It is well-known that any two vertices in the same expander are at small distance,
i.e. there is a path of length at most $\tilde{O}(1)$ between them. We use a dynamic shortest paths data
structure [CS21] for expander graphs on $S_i$ to find such short paths between the endpoints of each
edge $e$ in $G/F_i$ and take them to be the embedding paths (here we lose an $m^{o(1)}$ factor in the length
of the paths due to the data structure).

It remains to observe that each spanner $S_i$ has a nearly linear number of edges because each
graph $H_i$ has average degree close to its minimum degree, and edges are sampled independently
with probability $p_i$. Thus, letting $S(G, F_i)$ be the union of all graphs $S_i$ and using that each
vertex is in at most $O(1)$ graphs $H_i$, we conclude the desired sparsity bound on $S(G, F_i)$. We take
$\Pi_{G/F_i \to S(G,F_i)}$ to be the union of the embeddings constructed above and observe that the length of
embedding paths is at most $m^{o(1)}$ as desired.


The Dynamic Algorithm. To make the above algorithm dynamic, let us assume that there is a
spanner $S(G, F_i)$ with corresponding embedding $\Pi_{G/F_i \to S(G,F_i)}$ and after its computation, a batch
of updates $U$ is applied to $G/F_i$ (consisting of edge insertions/deletions and vertex splits). Clearly,
after forwarding the updates $U$ to the current spanner $S(G, F_i)$, by deleting edges that were deleted
from $G/F_i$ and splitting vertices, we have that for some edges $e \in G/F_i$, the updated embedding
$\Pi_{G/F_i \to S(G,F_i)}(e)$ might no longer be a proper path.

We therefore need to add new edges to $S(G, F_i)$ and fix the embedding. We start by defining $S$
to be the vertices that are touched by an update in $U$, meaning for the deletion/insertion of edge
$(u,v)$ we add $u$ and $v$ to $S$ and for a vertex split of $v$ into $v$ and $v'$, we add $v$ and $v'$ to $S$. Note
that $|S| \le 2|U|$ and that all $\Pi_{G/F_i \to S(G,F_i)}(e)$ that are no longer proper paths intersect with $S$.

We now fix the embedding by constructing a new static spanner on a special graph $J$ over the
vertices of $S$. More precisely, for each $e = (a,b)$ in $G/F_i$ where $\Pi_{G/F_i \to S(G,F_i)}(e)$ intersects with $S$,
we find the vertices $\hat{a}, \hat{b}$ in $S$ that are closest to $a$ and $b$ on $\Pi_{G/F_i \to S(G,F_i)}(e)$, and then insert an
edge $\hat{e} = (\hat{a}, \hat{b})$ into the graph $J$. We say that $e$ is the pre-image of $\hat{e}$ (and $\hat{e}$ the image of $e$ in $J$).

Finally, we run the static algorithm from the last paragraph to find a sparsifier $\tilde{J}$ of $J$ and let
$\Pi_{J \to \tilde{J}}$ be the corresponding embedding. Then, for each edge $\hat{e}$ that was sampled into $\tilde{J}$, we add its
pre-image $e$ to the current sparsifier $S(G, F_i)$.

To fix the embedding, for each $\hat{e} = (\hat{a}, \hat{b}) \in \tilde{J}$, we observe that since $e = (a,b)$ was added to
$S(G, F_i)$, we can simply embed the edge $e$ into itself. We define for each such edge $\hat{e}$ the path

$$P_{\hat{e}} = \Pi_{G/F_i \to S(G,F_i)}(e)[\hat{a}, a] \oplus (a, b) \oplus \Pi_{G/F_i \to S(G,F_i)}(e)[b, \hat{b}],$$

which is a path between the endpoints of $\hat{e}$. This path is in the current graph $S(G, F_i)$ since we
added $(a,b)$ to the spanner and, by definition of $\hat{a}$, we have that $\Pi_{G/F_i \to S(G,F_i)}(e)[\hat{a}, a]$ is still a
proper path; the same goes for $\hat{b}$.

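But this means we can embed each edge $f = (c,d)$ even if its image $\hat f = (\hat c, \hat d) \notin \tilde J$, since we can
simply set it to the path

$$\Pi_{G/F_i \to S(G,F_i)}(f)[c, \hat c] \;\oplus \bigoplus_{\hat e \in \Pi_{J \to \tilde J}(\hat f)} P_{\hat e} \;\oplus\; \Pi_{G/F_i \to S(G,F_i)}(f)[\hat d, d].$$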


By the guarantees from the previous paragraph, we have that the sparsifier $\tilde J$ has average degree
$\tilde O(1)$, and we only added the pre-images of edges in $\tilde J$ to $S(G, F_i)$. Since $J$ (and $\tilde J$) are taken over
the vertex set $S$, we can conclude that we only cause $\tilde O(|S|) = \tilde O(|U|)$ recourse to the spanner.
Further, since each new path $\Pi_{G/F_i \to S(G,F_i)}(e)$ now consists of $\tilde O(1)$ path segments from
the old embedding $\Pi_{G/F_i \to S(G,F_i)}$ (plus $\tilde O(1)$ edges), the maximum length of the embedding paths
has only increased by a factor of $\tilde O(1)$ overall. Finally, we take $D$ to be the set of edges on $P_{\hat e}$
for all $\hat e \in \tilde J$. Clearly, each edge $f$ embeds into a subpath of its previous embedding path (to
reach the first and last vertex in $S$) and into some paths $P_{\hat e}$, all of which now have edges in $D$.
To bound the size of $D$, we observe that each path $P_{\hat e}$ is also short, since it is obtained
by combining two old embedding paths (which were short) and a single edge. Thus, we have
$|D| = |\bigcup_{\hat e \in \tilde J} P_{\hat e}| = \tilde O(|\tilde J|) = \tilde O(|U|)$, which again is only $\tilde O(1)$ when amortizing over the number of
updates. Figure 4 gives an example of this spanner maintenance procedure in action.

By using standard batching techniques, we can also deal with sequences of update batches
$U^{(1)}, U^{(2)}, \ldots$ to the spanner and ensure that we cause only $m^{o(1)}$ amortized recourse per update/size
of $D$ to the spanner.


An Efficient Implementation. While the algorithm above achieves low recourse, so far, we
have not reasoned about the run-time. To do so, we enforce low vertex-congestion of $\Pi_{G/F_i \to S(G,F_i)}$,
defined to be the maximum number of paths $\Pi_{G/F_i \to S(G,F_i)}(e)$ that any vertex $v \in V(G/F_i)$ occurs
on. More precisely, we implement the algorithm above such that the vertex congestion of
$\Pi_{G/F_i \to S(G,F_i)}$ remains of order $\gamma_c \Delta_{\max}(G/F_i)$ for some $\gamma_c = m^{o(1)}$ over the entire course of the
algorithm. We note that by a standard transformation, we can assume wlog that $\Delta_{\max}(G/F_i) = \tilde O(k)$.

Crucially, using our bound on the vertex congestion, we can argue that the graph $J$ has maximum
degree $\gamma_c \Delta_{\max}(G/F_i)$. Since we can implement the static spanner algorithm in time near-linear
in the number of edges, this implies that the entire algorithm to compute a sparsifier $\tilde J$ only
takes time $\approx |U| \gamma_c \Delta_{\max}(G/F_i) \approx |U| m^{o(1)} k$, and thus amortized time $k m^{o(1)}$ per update.

It remains to obtain this vertex congestion bound. Let us first discuss the static algorithm.
Previously, we exploited only in a rather crude way that each sparsifier $S_i$ is an expander (since it is a
cut-sparsifier of $H_i$). But it is not hard to see via the multi-commodity max-flow min-cut theorem
[LR99] that this property can be used to argue the existence of an embedding $\Pi_{H_i \to S_i}$ that uses
each edge in $S_i$ on at most $\tilde O(1/p_i)$ embedding paths, and therefore each path has average length
$\tilde O(1)$. In fact, using the shortest paths data structures on expanders [CS21], we can find such an
embedding and turn the average length guarantee into a worst-case guarantee.

[Figure 4 panel labels: $G/F_i$ and spanner $S(G, F_i)$ with spanner edges and embedding paths $\Pi_{G/F_i \to S(G,F_i)}$; $G/F_i$ and spanner $S(G, F_i)$ with edges deleted, showing the deleted edges, the touched vertices $S$, the projected edges in $J$, the edges deleted from $J$ to obtain $\tilde J$, and the embedding paths $\Pi_{J \to \tilde J}$; updated $G/F_i$ and spanner $S(G, F_i)$ with spanner edges and embedding paths.]

Figure 4: Illustration of the procedure for maintaining $S(G, F_i)$ under edge deletions.

This ensures that each edge has congestion at most $\tilde O(1/p_i) = \tilde O(\Delta_{\max}(G/F_i))$, and because
$S(G, F_i)$ has average degree $\tilde O(1)$, this also bounds the vertex congestion. We need to refine this
argument carefully for the dynamic version but can then argue that, due to the batching, we only
increase the vertex congestion slightly. We refer the reader to Section 5 for the full implementation
and analysis.


3 Preliminaries 


Model of Computation. In this article, for problem instances encoded with $z$ bits, all algorithms
work in fixed-point arithmetic where words have $O(\log^{O(1)} z)$ bits, i.e. we prove that all numbers
stored are in $[\exp(-\log^{O(1)} z), \exp(\log^{O(1)} z)]$.


General notions. We denote vectors by boldface lowercase letters. We use uppercase boldface to
denote matrices. Often, we use uppercase matrices to denote the diagonal matrices corresponding
to lowercase vectors, such as $L = \mathrm{diag}(\ell)$. For vectors $x, y$ we define the vector $x \circ y$ as the
entrywise product, i.e. $(x \circ y)_i = x_i y_i$. We also define the entrywise absolute value of a vector
$|x|$ as $|x|_i = |x_i|$. We use $\langle \cdot, \cdot \rangle$ as the vector inner product: $\langle x, y \rangle = x^\top y = \sum_i x_i y_i$. We elect
to use this notation when $x, y$ have superscripts (such as time indexes) to avoid cluttering. For
positive real numbers $a, b$ we write $a \approx_\alpha b$ for some $\alpha > 1$ if $\alpha^{-1} b \le a \le \alpha b$. For positive vectors
$x, y \in \mathbb{R}^n_{>0}$ we say $x \approx_\alpha y$ if $x_i \approx_\alpha y_i$ for all $i \in [n]$. This notion extends naturally to positive
diagonal matrices. We will need the standard Chernoff bound.
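
As a small illustration of the $\approx_\alpha$ notation, the following sketch checks the defining inequalities directly. The function names are hypothetical and chosen only for this example.

```python
def approx(a, b, alpha):
    """True iff a is within a multiplicative factor alpha of b.

    Implements the definition b / alpha <= a <= alpha * b for positive
    reals a, b and alpha > 1.
    """
    return b / alpha <= a <= alpha * b


def approx_vec(x, y, alpha):
    """Entrywise version for positive vectors of equal length."""
    return len(x) == len(y) and all(approx(xi, yi, alpha) for xi, yi in zip(x, y))
```

Note that the relation is symmetric: `approx(a, b, alpha)` holds exactly when `approx(b, a, alpha)` does.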


Theorem 3.1 (Chernoff Bound). Suppose $X_1, X_2, \ldots, X_k \in [0, W]$ are independent random variables,
$X = \sum_i X_i$ and $\mu = \mathbb{E}X$. For any $\delta > 1/2$, we have $\Pr[X \in [(1 - \delta)\mu, (1 + \delta)\mu]] \ge 1 - 2e^{-\frac{\delta\mu}{3W}}$.


Graphs. In this article, we consider multi-graphs $G$, with edge set $E(G)$ and vertex set $V(G)$.
When the graph is clear from context, we use the short-hands $E$ for $E(G)$, $V$ for $V(G)$, $m = |E|$,
$n = |V|$. We assume that each edge $e \in E$ has an implicit direction, used to define its edge-vertex
incidence matrix $B$. Abusing notation slightly, we often write $e = (u,v) \in E$ where $e$ is an
edge in $E$ and $u$ and $v$ are the tail and head of $e$ respectively (note that technically multi-graphs
do not allow for edges to be specified by their endpoints). We let $\mathrm{rev}(e)$ be the edge $e$ reversed: if
$e = (u,v)$ points from $u$ to $v$, then $\mathrm{rev}(e)$ points from $v$ to $u$.

We say a flow $f \in \mathbb{R}^E$ routes a demand $d \in \mathbb{R}^V$ if $B^\top f = d$. For an edge $e = (u,v) \in G$ we let
$b_e \in \mathbb{R}^V$ denote the demand vector of routing one unit from $u$ to $v$.
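
The condition $B^\top f = d$ can be checked edge by edge without forming $B$. The sketch below assumes the sign convention that an edge contributes $+1$ at its tail and $-1$ at its head; this convention and the function names are assumptions made for illustration.

```python
def net_outflow(num_vertices, edges, flow):
    """Compute B^T f, assuming B[e][tail] = +1 and B[e][head] = -1.

    `edges` is a list of (tail, head) pairs with vertices 0..num_vertices-1
    and `flow` lists the flow value on each edge in the same order.
    """
    d = [0] * num_vertices
    for (u, v), fe in zip(edges, flow):
        d[u] += fe  # flow leaves the tail
        d[v] -= fe  # flow enters the head
    return d


def routes(num_vertices, edges, flow, demand):
    """True iff the flow routes the given demand vector."""
    return net_outflow(num_vertices, edges, flow) == list(demand)
```

Under this convention, the vector $b_e$ for $e = (u,v)$ has a $+1$ at $u$ and a $-1$ at $v$, so sending one unit along any $u$-$v$ path routes $b_e$.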

We denote by $\deg_G(v)$ the degree of $v$ in $G$, i.e. the number of incident edges. We let $\Delta_{\max}(G)$
and $\Delta_{\min}(G)$ denote the maximum and minimum degree of graph $G$. We define the volume of a
set $S \subseteq V$ as $\mathrm{vol}_G(S) = \sum_{v \in S} \deg_G(v)$.


Dynamic Graphs. We say $G$ is a dynamic graph if it undergoes batches $U^{(1)}, U^{(2)}, \ldots$ of updates
consisting of edge insertions/deletions and/or vertex splits that are applied to $G$. We stress that
results on dynamic graphs in this article often only consider a subset of the update types and we
therefore explicitly state for each dynamic graph which updates are allowed. We say that the graph
$G$, after applying the first $t$ update batches $U^{(1)}, U^{(2)}, \ldots, U^{(t)}$, is at stage $t$ and denote the graph
at this stage by $G^{(t)}$. Additionally, when $G$ is clear, we often denote the value of a variable $x$ at
the end of stage $t$ of $G$ by $x^{(t)}$, or a vector $\boldsymbol{x}$ at the end of stage $t$ of $G$ by $\boldsymbol{x}^{(t)}$.

For each update batch $U^{(t)}$, we encode edge insertions by a tuple of tail and head of the new
edge and deletions by a pointer to the edge that is about to be deleted. We further also encode
vertex splits by a sequence of edge insertions and deletions as follows: if a vertex $v$ is about to be
split and the vertex that is split off is denoted $v^{\mathrm{NEW}}$, we can delete all edges that are incident to
$v$ but should be incident to $v^{\mathrm{NEW}}$ from $v$ and then re-insert each such edge via an insertion (we
allow insertions to new vertices that do not yet exist in the graph).

For technical reasons, we assume that in an update batch $U^{(t)}$, the updates to implement the
vertex splits are last, and that we always encode a vertex split of $v$ into $v$ and $v^{\mathrm{NEW}}$ such that
$\deg_{G^{(t+1)}}(v^{\mathrm{NEW}}) \le \deg_{G^{(t+1)}}(v)$. We let the vertex set of graph $G^{(t)}$ consist of the union of all
endpoints of edges in the graph (in particular if a vertex is split, the new vertex $v^{\mathrm{NEW}}$ is added due
to having edge insertions incident to this new vertex $v^{\mathrm{NEW}}$ in $U^{(t)}$).

We let $\mathrm{ENC}(u)$ of an update $u \in U^{(t)}$ be the size of the encoding of the update and note that for
edge insertions/deletions, we have $\mathrm{ENC}(u) = \tilde O(1)$, and for a vertex split of $v$ into $v$ and $v^{\mathrm{NEW}}$
as described above we have $\mathrm{ENC}(u) = \tilde O(\deg_{G^{(t+1)}}(v^{\mathrm{NEW}}))$. For a batch of updates $U$, we let
$\mathrm{ENC}(U) = \sum_{u \in U} \mathrm{ENC}(u)$. In this article, we only consider dynamic graphs where the total size of
the encodings of all update batches is polynomially bounded in the size of the initial graph $G^{(0)}$.

We point out in particular that the number of updates |U| in an update batch U can be 
completely different from the actual encoding size ENC(U) of the update batch U. 
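
To make the gap between $|U|$ and $\mathrm{ENC}(U)$ concrete, here is a minimal sketch of how a single vertex split expands into edge deletions and insertions. For simplicity it identifies edges by their endpoints rather than by pointers, and all names are hypothetical.

```python
def encode_vertex_split(v, v_new, moved_neighbors):
    """Encode splitting v into v and v_new as deletions then insertions.

    `moved_neighbors` lists the other endpoint of every edge that is
    currently incident to v but should end up incident to v_new.
    """
    deletions = [("delete", (v, w)) for w in moved_neighbors]
    insertions = [("insert", (v_new, w)) for w in moved_neighbors]
    return deletions + insertions


def enc_size(ops):
    """Encoding size of a sequence of unit-size operations."""
    return len(ops)
```

A batch containing one vertex split therefore has $|U| = 1$, while its encoding size grows linearly with the degree of the split-off vertex.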


Paths, Flows, and Trees. Given a path $P$ in $G$ with vertices $u, v$ both on $P$, we let $P[u, v]$
denote the path segment on $P$ from $u$ to $v$. We note that if $v$ precedes $u$ on $P$, then the segment
$P[u, v]$ is in the reverse direction of $P$. For forests $F$, we similarly define $F[u, v]$ as the path from
$u$ to $v$ along edges in the forest $F$. We ensure that $u, v$ are in the same connected component of $F$
whenever this notation is used.

We let $p(F[u,v]) \in \mathbb{R}^E$ denote the flow vector routing one unit from $u$ to $v$ along the path
in $F$. In this way, $|p(F[u, v])|$ is the indicator vector for the path from $u$ to $v$ on $F$. Note that
$p(F[u, v]) + p(F[v, w]) = p(F[u, w])$ for any vertices $u, v, w \in V$.

The stretch of $e = (u,v)$ with respect to a tree $T$ is defined as

$$\mathrm{str}^{T,\ell}_e \stackrel{\mathrm{def}}{=} 1 + \frac{\langle \ell, |p(T[u,v])| \rangle}{\ell_e} = 1 + \frac{\sum_{e' \in T[u,v]} \ell_{e'}}{\ell_e}.$$

This differs slightly from the more common definition of stretch because of the $1 +$ term; we
do this to ensure that $\mathrm{str}^{T,\ell}_e \ge 1$ for all $e$. It is known how to efficiently construct trees with
polylogarithmic average stretch with respect to underlying weights. These are called low-stretch
spanning trees (LSSTs).
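
The definition of stretch can be evaluated directly on small examples. The sketch below assumes a simple tree stored as an adjacency dictionary with lengths keyed by unordered vertex pairs; the representation and function names are assumptions for illustration, and the traversal is a naive search rather than a dynamic tree.

```python
def tree_path_length(tree, lengths, u, v):
    """Total length of the unique u-v path in a tree.

    `tree` maps each vertex to a list of its tree neighbours and `lengths`
    maps frozenset({x, y}) to the length of tree edge {x, y}.
    """
    stack = [(u, None, 0.0)]
    while stack:
        x, parent, dist = stack.pop()
        if x == v:
            return dist
        for y in tree[x]:
            if y != parent:  # never walk back along the edge we came from
                stack.append((y, x, dist + lengths[frozenset((x, y))]))
    raise ValueError("u and v are not connected in the tree")


def stretch(tree, lengths, e, length_e):
    """Stretch of edge e = (u, v) with own length length_e, including the 1 + term."""
    u, v = e
    return 1.0 + tree_path_length(tree, lengths, u, v) / length_e
```

For a tree edge, the tree path is the edge itself, so its stretch is exactly 2 under this definition.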


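Theorem 3.2 (Static LSST [AN19]). Given a graph $G = (V,E)$ with lengths $\ell \in \mathbb{R}^E_{>0}$ and
weights $v \in \mathbb{R}^E_{>0}$, there is an algorithm that runs in time $\tilde O(m)$ and computes a tree $T$ such that
$\sum_{e \in E} v_e \, \mathrm{str}^{T,\ell}_e \le O(\|v\|_1 \log n \log\log n)$.


We let $\gamma_{\mathrm{LSST}} \stackrel{\mathrm{def}}{=} O(\log n \log\log n)$.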


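Graph Embeddings. Given graphs $G$ and $H$ with $V(G) \subseteq V(H)$, we say that $\Pi_{G \to H}$ is a
graph-embedding from $G$ into $H$ if it maps each edge $e^G = (u,v) \in E(G)$ to a $u$-$v$ path $\Pi_{G \to H}(e^G)$ in
$H$. We define the congestion of an edge $e^H$ by $\mathrm{econg}(\Pi_{G \to H}, e^H) \stackrel{\mathrm{def}}{=} |\{e^G \in E(G) \mid e^H \in \Pi_{G \to H}(e^G)\}|$
and of the embedding by $\mathrm{econg}(\Pi_{G \to H}) \stackrel{\mathrm{def}}{=} \max_{e^H \in E(H)} \mathrm{econg}(\Pi_{G \to H}, e^H)$. Analogously, the
congestion of a vertex $v^H \in V(H)$ is defined by $\mathrm{vcong}(\Pi_{G \to H}, v^H) \stackrel{\mathrm{def}}{=} |\{e^G \in E(G) \mid v^H \in \Pi_{G \to H}(e^G)\}|$
and the vertex-congestion of the embedding by $\mathrm{vcong}(\Pi_{G \to H}) \stackrel{\mathrm{def}}{=} \max_{v^H \in V(H)} \mathrm{vcong}(\Pi_{G \to H}, v^H)$.
We define the length by $\mathrm{length}(\Pi_{G \to H}) \stackrel{\mathrm{def}}{=} \max_{e^G \in E(G)} |\Pi_{G \to H}(e^G)|$. We let (boldface) $\boldsymbol{\Pi}_{G \to H}(e) \in \mathbb{R}^{E(H)}$
for $e = (u,v)$ denote the vector representing the flow from $u$ to $v$ along this path. Thus $B^\top \boldsymbol{\Pi}_{G \to H}(e) = b_e$.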

For a path $p_1$ with endpoints $u \to v$ and $p_2$ with endpoints $v \to w$, we define $p_1 \oplus p_2$ as the
concatenation, which is a path from $u \to w$.

Sometimes we consider the edges that route into an edge $e \in E(H)$. Given graphs $G, H$ and an
embedding $\Pi_{G \to H}$, for an edge $e \in E(H)$ we define $\Pi^{-1}_{G \to H}(e) \stackrel{\mathrm{def}}{=} \{e' \in E(G) : e \in \Pi_{G \to H}(e')\}$. The notation is
natural if we think of $\Pi$ as a function from an edge $e$ to the set of edges in its path, and hence $\Pi^{-1}$
is the inverse/preimage of a one-to-many function.
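
The three embedding parameters defined above can be computed by direct counting. This sketch assumes each embedding path is given as a list of vertices of $H$ and that $H$ is simple, so an edge of $H$ is identified by its pair of endpoints; the function name is hypothetical.

```python
from collections import Counter


def embedding_stats(embedding):
    """Return (econg, vcong, length) of an embedding.

    `embedding` maps each edge of G to its path in H, given as a vertex list.
    Every path is counted at most once per edge and per vertex of H.
    """
    econg, vcong = Counter(), Counter()
    length = 0
    for path in embedding.values():
        path_edges = {frozenset(p) for p in zip(path, path[1:])}
        length = max(length, len(path) - 1)  # number of edges on the path
        econg.update(path_edges)
        vcong.update(set(path))
    return (max(econg.values(), default=0),
            max(vcong.values(), default=0),
            length)
```

The same counters, keyed by an edge $e$ of $H$, also give the size of the preimage $\Pi^{-1}_{G \to H}(e)$.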


Dynamic trees. Our algorithms make heavy use of dynamic tree data structures, so we state a
lemma describing the variety of operations that can be supported on a dynamic tree. This includes
path updates, either of the form of adding a directed flow along a tree path, or adding a positive value
to each edge on a tree path. Additionally, the data structure can support changing edges in the
tree, and querying flow values on edges. Each of these operations can be performed in amortized
$\tilde O(1)$ time.


Lemma 3.3 (Dynamic trees, see [ST83]). There is a deterministic data structure $\mathcal{D}^{(T)}$ that maintains
a dynamic tree $T \subseteq G = (V, E)$ under insertion/deletion of edges with gradients $g$ and lengths
$\ell$, and supports the following operations:


1. Insert/delete edges $e$ to $T$, under the condition that $T$ is always a tree, or update the gradient
$g_e$ or length $\ell_e$. The amortized time is $\tilde O(1)$ per change.


2. For a path vector $\Delta = p(T[u, v])$ for some $u, v \in V$, return $\langle g, \Delta \rangle$ or $\langle \ell, |\Delta| \rangle$ in time $\tilde O(1)$.


3. Maintain a flow $f \in \mathbb{R}^E$ under operations $f \leftarrow f + \eta\Delta$ for $\eta \in \mathbb{R}$ and path vector $\Delta = p(T[u, v])$,
or query the value $f_e$ in amortized time $\tilde O(1)$.


4. Maintain a positive flow $f \in \mathbb{R}^E_{>0}$ under operations $f \leftarrow f + \eta|\Delta|$ for $\eta \in \mathbb{R}_{\ge 0}$ and path vector
$\Delta = p(T[u, v])$, or query the value $f_e$ in amortized time $\tilde O(1)$.


5. DETECT(). For a fixed parameter $\epsilon$, and under positive flow updates (item 4), where $\Delta^{(t)}$ is
the update vector at time $t$, returns

$$S^{(t)} \stackrel{\mathrm{def}}{=} \Big\{ e \in E : \ell_e \sum_{t' \in [\mathrm{last}^{(t)}_e + 1,\, t]} |\Delta^{(t')}_e| \ge \epsilon \Big\} \qquad (7)$$

where $\mathrm{last}^{(t)}_e$ is the last time before $t$ that $e$ was returned by DETECT(). Runs in time $\tilde O(|S^{(t)}|)$.


Proof. Every operation described is standard except for DETECT, which we now give an algorithm
for. Note that (7) is equivalent to the following:

$$\sum_{t' \in [\mathrm{last}^{(t)}_e + 1,\, t]} |\Delta^{(t')}_e| - \frac{\epsilon}{\ell_e} \ge 0.$$

This value can be maintained using positive flow updates (item 4), i.e. adding $|\Delta|$ along a tree path. We
reset the value of an edge $e$ to $-\epsilon/\ell_e$ once it is detected. Locating and collecting edges satisfying
(7) is reduced to finding edges with nonnegative values, which can be done in $\tilde O(|S^{(t)}|)$ time by
repeatedly querying the largest value on the tree, and checking whether it is nonnegative.

The DETECT operation allows our algorithm to decide when we need to change the gradients
and lengths of an edge $e$ in our IPM.
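
The semantics of DETECT can be illustrated with a naive reference implementation. This sketch is not the dynamic-tree data structure of Lemma 3.3: it stores one accumulator per edge and scans all edges on every call, so it does not achieve the stated running time. The class and method names are hypothetical.

```python
class NaiveDetect:
    """Reference semantics for DETECT with explicit per-edge accumulators."""

    def __init__(self, lengths, eps):
        self.lengths = dict(lengths)              # edge -> length l_e
        self.eps = eps                            # fixed threshold
        self.acc = {e: 0.0 for e in self.lengths}  # flow added since last detection

    def add_positive_flow(self, path_edges, eta):
        """Add eta >= 0 to every edge on a tree path (item 4 of the lemma)."""
        for e in path_edges:
            self.acc[e] += eta

    def detect(self):
        """Return edges whose accumulated l_e * |flow| reached eps, then reset them."""
        hit = [e for e, a in self.acc.items() if self.lengths[e] * a >= self.eps]
        for e in hit:
            self.acc[e] = 0.0
        return hit
```

Resetting the accumulator to zero here corresponds to resetting the maintained value to $-\epsilon/\ell_e$ in the proof above.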


4 Potential Reduction Interior Point Method 


The goal of this section is to present a primal-only potential reduction IPM [Kar84] that solves the
min-cost flow problem on a graph $G = (V, E)$ with demands $d \in \mathbb{Z}^V$, lower and upper capacities
$u^-, u^+ \in \mathbb{Z}^E$, and costs $c \in \mathbb{Z}^E$ such that all integers are bounded by $U$:

$$f^* \in \arg\min_{\substack{B^\top f = d \\ u^-_e \le f_e \le u^+_e \text{ for all } e \in E}} c^\top f. \qquad (8)$$


Instead of using the standard logarithmic barrier, we elect to use the barrier $x^{-\alpha}$ for small $\alpha$.
This is because we do not know how to prove that the lengths encountered during the algorithm
are quasipolynomially bounded for the logarithmic barrier. Precisely, we consider the following
potential function, where $F^* \stackrel{\mathrm{def}}{=} c^\top f^*$ is the optimal value for (8), and $\alpha \stackrel{\mathrm{def}}{=} 1/(1000 \log mU)$. We
assume that we know $F^*$, as running our algorithm allows us to binary search for $F^*$.

$$\Phi(f) \stackrel{\mathrm{def}}{=} 20m \log(c^\top f - F^*) + \sum_{e \in E} \left( (u^+_e - f_e)^{-\alpha} + (f_e - u^-_e)^{-\alpha} \right) \qquad (9)$$

We show in Section 4.3 that we can initialize a flow $f$ on a larger graph (still with $O(m)$ edges)
such that the potential $\Phi(f)$ is initially $O(m \log mU)$ (Lemma 4.12). Additionally, given a nearly
optimal solution, we can recover an exactly optimal solution to the original min-cost flow problem
in linear time (Lemma 4.11). A simple observation is that if the potential is sufficiently small, then
the cost of the flow is nearly optimal.
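
For intuition, the potential (9) can be evaluated directly on a strictly feasible flow. The sketch below takes the optimal value $F^*$ and the barrier exponent $\alpha$ as inputs; the function and parameter names are assumptions for illustration.

```python
import math


def potential(f, c, u_lo, u_hi, f_star_cost, alpha):
    """Evaluate the potential (9) at a strictly feasible, non-optimal flow.

    All arguments except f_star_cost and alpha are lists indexed by edge.
    Requires u_lo[e] < f[e] < u_hi[e] and c^T f > f_star_cost.
    """
    m = len(f)
    gap = sum(ce * fe for ce, fe in zip(c, f)) - f_star_cost
    barrier = sum((hi - fe) ** -alpha + (fe - lo) ** -alpha
                  for fe, lo, hi in zip(f, u_lo, u_hi))
    return 20 * m * math.log(gap) + barrier
```

The first term rewards progress in cost, while the barrier term grows as any edge approaches one of its capacity bounds.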


Lemma 4.1. We have $c^\top f - F^* \le \exp(\Phi(f)/(20m))$. In particular, if $\Phi(f) \le -200m \log mU$
then $c^\top f - F^* \le (mU)^{-10}$.


Proof. From (9) and the fact that $u^-_e \le f_e \le u^+_e$, we get

$$\Phi(f) \ge 20m \log(c^\top f - F^*).$$

Rearranging this gives the desired result.


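Given a flow $f \in \mathbb{R}^E$ we define lengths $\ell \in \mathbb{R}^E$ and gradients $g \in \mathbb{R}^E$ to capture the next $\ell_1$
problem we solve to decrease the potential.


Definition 4.2 (Lengths and gradients). Given a flow $f \in \mathbb{R}^E$ we define lengths $\ell \in \mathbb{R}^E$ as

$$\ell(f)_e \stackrel{\mathrm{def}}{=} (u^+_e - f_e)^{-1-\alpha} + (f_e - u^-_e)^{-1-\alpha} \qquad (10)$$

and gradients $g \in \mathbb{R}^E$ as $g(f) \stackrel{\mathrm{def}}{=} \nabla\Phi(f)$. More explicitly,

$$g(f)_e \stackrel{\mathrm{def}}{=} [\nabla\Phi(f)]_e = 20m (c^\top f - F^*)^{-1} c_e + \alpha (u^+_e - f_e)^{-1-\alpha} - \alpha (f_e - u^-_e)^{-1-\alpha} \qquad (11)$$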
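
The lengths (10) and gradients (11) are simple closed-form expressions in the residual capacities. The following sketch evaluates them for a strictly feasible flow; function and parameter names are hypothetical, and $F^*$ is assumed to be known.

```python
def lengths(f, u_lo, u_hi, alpha):
    """Lengths from (10): sum of the two inverse residual powers per edge."""
    return [(hi - fe) ** (-1 - alpha) + (fe - lo) ** (-1 - alpha)
            for fe, lo, hi in zip(f, u_lo, u_hi)]


def gradients(f, c, u_lo, u_hi, f_star_cost, alpha):
    """Gradients from (11): the gradient of the potential at f."""
    m = len(f)
    gap = sum(ce * fe for ce, fe in zip(c, f)) - f_star_cost
    return [20 * m * ce / gap
            + alpha * (hi - fe) ** (-1 - alpha)   # pushes away from upper bound
            - alpha * (fe - lo) ** (-1 - alpha)   # pushes away from lower bound
            for fe, ce, lo, hi in zip(f, c, u_lo, u_hi)]
```

Note that the barrier part of each gradient entry is bounded in absolute value by $\alpha$ times the corresponding length, which is what the later proofs exploit.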


The remainder of the section is split into three parts. In Section 4.1 we show that approximately
solving the cycle problem induced by gradients and lengths approximating those in Definition 4.2
allows us to decrease the potential additively by an almost constant quantity in a single iteration.
Then in Section 4.2 we bound how such iterations affect the lengths and gradients in order to
show that approximate versions of them only need to be modified $m^{1+o(1)}$ times across the entire
algorithm, and in Section 4.3 we discuss how to get an initial flow and extract an exact min-cost
flow from a nearly optimal flow.

The following theorem summarizes the results of this section. 


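Theorem 4.3. Suppose we are given a min-cost flow instance given by Equation (8). Let $f^*$ denote
an optimal solution to the instance.
For all $\kappa \in (0,1)$, there is a potential reduction interior point method for this problem that,
given an initial flow $f^{(0)} \in \mathbb{R}^E$ such that $\Phi(f^{(0)}) \le 200m \log mU$, proceeds as follows:
The algorithm runs for $\tilde O(m\kappa^{-2})$ iterations. At each iteration, let $g(f^{(t)}) \in \mathbb{R}^E$ denote the
gradient and $\ell(f^{(t)}) \in \mathbb{R}^E_{>0}$ denote the lengths given by Definition 4.2. Let $\tilde g \in \mathbb{R}^E$ and $\tilde\ell \in \mathbb{R}^E_{>0}$ be
any vectors such that $\|L(f^{(t)})^{-1}(\tilde g - g(f^{(t)}))\|_\infty \le \kappa/8$ and $\tilde\ell \approx_2 \ell(f^{(t)})$.

1. At each iteration, the hidden circulation $f^* - f^{(t)}$ satisfies

$$\frac{\tilde g^\top (f^* - f^{(t)})}{100m + \|\tilde L (f^* - f^{(t)})\|_1} \le -\alpha/4.$$

2. At each iteration, given any $\Delta$ satisfying $B^\top \Delta = 0$ and $\tilde g^\top \Delta / \|\tilde L \Delta\|_1 \le -\kappa$, it updates
$f^{(t+1)} \leftarrow f^{(t)} + \eta\Delta$ for $\eta \leftarrow \kappa^2 / (50 \cdot |\tilde g^\top \Delta|)$.

3. At the end of $\tilde O(m\kappa^{-2})$ iterations, we have $c^\top f^{(t)} \le c^\top f^* + (mU)^{-10}$.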


Intuitively, the algorithm will compute a sequence of flows $f^{(0)}, f^{(1)}, \ldots$ and maintain approximations
$\tilde g, \tilde\ell$ of $g(f^{(t)}), \ell(f^{(t)})$ respectively. Each iteration, the algorithm will call an oracle for
approximating the minimum-ratio cycle, i.e. $\min_{B^\top \Delta = 0} \tilde g^\top \Delta / \|\tilde L \Delta\|_1$. The first item shows that the
optimal ratio is at most $-\alpha/4$. Thus if the oracle returns an $m^{o(1)}$ approximation, the returned
circulation has $\kappa \ge m^{-o(1)}$. Scaling $\Delta$ appropriately and adding it to $f$ decreases the potential
by $\Omega(\kappa^2)$, hence the potential drops to $-O(m \log m)$ within $\tilde O(m\kappa^{-2})$ iterations.

In Section 9 we will give a formal description of the interaction of the algorithm of Theorem 4.3
and our data structures to implement each step in amortized $m^{o(1)}$ time. As part of this, we argue
that we can change $\tilde g$ and $\tilde\ell$ only $\tilde O(m\kappa^{-2})$ total times. This is encapsulated in Lemma 9.4.
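
As a rough sanity check of the iteration count, one can combine the starting potential from Theorem 4.3 with a target potential of $-200m \log mU$ (Lemma 4.1) and a per-step decrease of $\kappa^2/500$ (the guarantee of Lemma 4.4). The arithmetic below is a sketch under exactly these assumptions:

```latex
\[
\#\text{iterations}
\;\le\; \frac{\Phi(f^{(0)}) - \left(-200\, m \log mU\right)}{\kappa^{2}/500}
\;\le\; \frac{400\, m \log mU}{\kappa^{2}/500}
\;=\; \frac{2 \cdot 10^{5}\, m \log mU}{\kappa^{2}}
\;=\; \tilde{O}\!\left(m \kappa^{-2}\right).
\]
```

With $\kappa = m^{-o(1)}$ this is $m^{1+o(1)}$ iterations when $U$ is polynomially bounded.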


4.1 One Step Analysis 


Consider a current flow $f$ and lengths/gradients $\ell(f), g(f)$ defined in Definition 4.2, with $L =
\mathrm{diag}(\ell)$. The problem we will solve approximately in each iteration will be

$$\min_{B^\top \Delta = 0} \frac{g(f)^\top \Delta}{\|L(f)\Delta\|_1}. \qquad (12)$$

Alternatively, this can be viewed as constraining $B^\top \Delta = 0$ and $g(f)^\top \Delta = -1$, and then minimizing
$\|L(f)\Delta\|_1$. Our first goal is to show that an approximate solution to (12) for approximations of
the gradient and lengths allows us to decrease the potential.
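
The objective of (12) is easy to evaluate for a given candidate circulation. The sketch below only computes the ratio for a vector that is assumed to already satisfy $B^\top \Delta = 0$; it is not a solver, and the function name is hypothetical.

```python
def cycle_ratio(g, ell, delta):
    """Objective of (12): gradient inner product over weighted l1 length.

    `g`, `ell` and `delta` are lists indexed by edge; `ell` is positive and
    `delta` is a nonzero circulation (flow conservation is not checked here).
    """
    num = sum(ge * de for ge, de in zip(g, delta))
    den = sum(le * abs(de) for le, de in zip(ell, delta))
    return num / den
```

The ratio is invariant under positive scaling of the circulation and flips sign when the circulation is reversed, which is why the constraint $g(f)^\top \Delta = -1$ loses no generality.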


Lemma 4.4. Let $\tilde g \in \mathbb{R}^E$ satisfy $\|L(f)^{-1}(\tilde g - g(f))\|_\infty \le \kappa/8$ for some $\kappa \in (0,1)$, and $\tilde\ell \in \mathbb{R}^E_{>0}$
satisfy $\tilde\ell \approx_2 \ell(f)$. Let $\Delta$ satisfy $B^\top \Delta = 0$ and $\tilde g^\top \Delta / \|\tilde L \Delta\|_1 \le -\kappa$. Let $\eta$ satisfy $\eta \tilde g^\top \Delta = -\kappa^2/50$.
Then

$$\Phi(f + \eta\Delta) \le \Phi(f) - \frac{\kappa^2}{500}.$$

Before showing this, we need simple bounds on the Taylor expansion of the logarithmic barrier
and $x^{-\alpha}$ in the region where the second derivative is stable.


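Lemma 4.5 (Taylor expansion for $x^{-\alpha}$). If $|\Delta_e| \le \frac{1}{10} \min(u^+_e - f_e, f_e - u^-_e)$ for $e \in E$ then

$$(u^+_e - f_e - \Delta_e)^{-\alpha} + (f_e + \Delta_e - u^-_e)^{-\alpha} \le (u^+_e - f_e)^{-\alpha} + (f_e - u^-_e)^{-\alpha}$$
$$\qquad + \alpha\left((u^+_e - f_e)^{-1-\alpha} - (f_e - u^-_e)^{-1-\alpha}\right)\Delta_e + \alpha\left((u^+_e - f_e)^{-2-\alpha} + (f_e - u^-_e)^{-2-\alpha}\right)\Delta_e^2. \qquad (13)$$

Also we have that

$$\left|(u^+_e - f_e - \Delta_e)^{-1-\alpha} - (u^+_e - f_e)^{-1-\alpha}\right| \le 2|\Delta_e| (u^+_e - f_e)^{-2-\alpha} \qquad (14)$$

and

$$\left|(f_e + \Delta_e - u^-_e)^{-1-\alpha} - (f_e - u^-_e)^{-1-\alpha}\right| \le 2|\Delta_e| (f_e - u^-_e)^{-2-\alpha}. \qquad (15)$$

(14) and (15) are useful for analyzing how a step improves the value of the potential function
$\Phi(f)$, as well as showing that the gradients $g(f)$ and lengths $\ell(f)$ are stable, i.e. change only
$m^{1+o(1)}$ times over $m^{1+o(1)}$ iterations.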


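Proof. Define $\phi(x) = (u^+_e - x)^{-\alpha} + (x - u^-_e)^{-\alpha}$. $\phi$ is a convex function with derivative

$$\phi'(x) = \alpha\left((u^+_e - x)^{-1-\alpha} - (x - u^-_e)^{-1-\alpha}\right)$$

and second derivative

$$\phi''(x) = \alpha(1 + \alpha)\left((u^+_e - x)^{-2-\alpha} + (x - u^-_e)^{-2-\alpha}\right).$$

In particular note that $\phi''(f_e + \delta) \approx_{1.3} \phi''(f_e)$ for any $|\delta| \le \frac{1}{10}\min(u^+_e - f_e, f_e - u^-_e)$, because
$1.1^{2+\alpha} \le 1.3$ by the choice of $\alpha$. Thus by Taylor's theorem we get that

$$\phi(f_e + \Delta_e) \le \phi(f_e) + \phi'(f_e)\Delta_e + \frac{1}{2}\max_{y \in [f_e, f_e + \Delta_e]} \phi''(y)\Delta_e^2 \le \phi(f_e) + \phi'(f_e)\Delta_e + \frac{1.3}{2}\phi''(f_e)\Delta_e^2$$
$$\le \phi(f_e) + \phi'(f_e)\Delta_e + \alpha\left((u^+_e - f_e)^{-2-\alpha} + (f_e - u^-_e)^{-2-\alpha}\right)\Delta_e^2,$$

which when expanded yields the desired bound. (14), (15) follow from a similar application of
Taylor's theorem on a first order expansion.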
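
Inequality (13) can be spot-checked numerically for a single edge. The sketch below is only a sanity check on sample points, not a proof; the function names and the chosen test values are assumptions for illustration.

```python
def phi(x, lo, hi, alpha):
    """Barrier contribution of one edge with capacities lo < x < hi."""
    return (hi - x) ** -alpha + (x - lo) ** -alpha


def rhs_13(f, d, lo, hi, alpha):
    """Right-hand side of (13) for flow value f and step d on one edge."""
    first = alpha * ((hi - f) ** (-1 - alpha) - (f - lo) ** (-1 - alpha)) * d
    second = alpha * ((hi - f) ** (-2 - alpha) + (f - lo) ** (-2 - alpha)) * d * d
    return phi(f, lo, hi, alpha) + first + second


def holds_13(f, d, lo, hi, alpha, tol=1e-12):
    """Check (13) at one point, allowing a small floating-point tolerance."""
    return phi(f + d, lo, hi, alpha) <= rhs_13(f, d, lo, hi, alpha) + tol
```

The check is only meaningful when the step satisfies the hypothesis $|\Delta_e| \le \frac{1}{10}\min(u^+_e - f_e, f_e - u^-_e)$ and $\alpha$ is small.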


Lemma 4.6 (Taylor expansion for $\log x$). If $|y| \le x/10$ for $x > 0$ then

$$\log(x + y) \le \log(x) + y/x + y^2/x^2. \qquad (16)$$


Proof. This is equivalent to $\log(1 + y/x) \le y/x + y^2/x^2$ for $|y/x| \le 1/10$, which follows from the
Taylor expansion $\log(1 + z) = \sum_{k \ge 1} (-1)^{k+1} z^k/k$ for $|z| < 1$.


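Proof of Lemma 4.4. We first bound $g(f)^\top \Delta$ by

$$\left|g(f)^\top \Delta - \tilde g^\top \Delta\right| \overset{(i)}{\le} \left\|L(f)^{-1}(g(f) - \tilde g)\right\|_\infty \|L(f)\Delta\|_1 \overset{(ii)}{\le} 2\varepsilon/\kappa \cdot |\tilde g^\top \Delta| \le |\tilde g^\top \Delta|/2,$$

where $\varepsilon$ denotes the bound on $\|L(f)^{-1}(\tilde g - g(f))\|_\infty$ from the lemma hypotheses, (i) follows from Hölder's inequality with the $\ell_1, \ell_\infty$ norms, (ii) follows from the lemma
hypotheses and $\|L(f)\Delta\|_1 \le 2\|\tilde L \Delta\|_1$, and the final inequality follows from $\varepsilon \le \kappa/8$. Hence

$$2\tilde g^\top \Delta \le g(f)^\top \Delta \le \tilde g^\top \Delta/2. \qquad (17)$$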


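We can also bound $c^\top \Delta$ by

$$\frac{20m}{c^\top f - F^*}\,|c^\top \Delta| = \Big|g(f)^\top \Delta - \alpha\sum_{e \in E}\left((u^+_e - f_e)^{-1-\alpha} - (f_e - u^-_e)^{-1-\alpha}\right)\Delta_e\Big| \qquad (18)$$
$$\overset{(i)}{\le} |g(f)^\top \Delta| + \alpha\|L(f)\Delta\|_1 \overset{(ii)}{\le} (2 + 2\alpha/\kappa)|\tilde g^\top \Delta|, \qquad (19)$$

where (i) follows from the triangle inequality, and (ii) from (17) plus the problem hypotheses. In
particular, we deduce that

$$|c^\top \Delta| \le \frac{c^\top f - F^*}{20m} \cdot (2 + 2\alpha/\kappa)|\tilde g^\top \Delta|. \qquad (20)$$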


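Let $\bar\Delta \stackrel{\mathrm{def}}{=} \eta\Delta$, the circulation that we add. From (20) we get that

$$|c^\top \bar\Delta| \le \eta \cdot \frac{c^\top f - F^*}{20m} \cdot (2 + 2\alpha/\kappa)|\tilde g^\top \Delta| \le \frac{c^\top f - F^*}{20m} \cdot 4\eta/\kappa \cdot |\tilde g^\top \Delta| \le \frac{\kappa}{250m}(c^\top f - F^*)$$

by the choice of $\eta$ in the problem hypothesis. Additionally, we have

$$\|L(f)\bar\Delta\|_1 \le 2\|\tilde L \bar\Delta\|_1 \le 2/\kappa \cdot |\tilde g^\top \bar\Delta| \le 2\eta/\kappa \cdot |\tilde g^\top \Delta| \le \kappa/25 \qquad (21)$$

by the choice of $\eta$. This implies that

$$|\bar\Delta_e| \le \kappa/25 \cdot (u^+_e - f_e)^{1+\alpha} \le \kappa/25 \cdot (2U)^\alpha \cdot (u^+_e - f_e) \le \kappa/10 \cdot (u^+_e - f_e),$$

where the last inequality follows from the choice $\alpha = 1/(1000 \log mU)$. $|\bar\Delta_e| \le \kappa/10 \cdot (f_e - u^-_e)$
follows similarly.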
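This bound allows us to apply:


• Lemma 4.5 on the current $u^+$, $u^-$, $f$, and $\bar\Delta$, and
• Lemma 4.6 for $x = c^\top f - F^*$ and $y = c^\top \bar\Delta$


to get

$$\Phi(f + \bar\Delta) - \Phi(f) \le g(f)^\top \bar\Delta + 20m\left(\frac{c^\top \bar\Delta}{c^\top f - F^*}\right)^2 + \alpha\sum_{e \in E}\left((u^+_e - f_e)^{-2-\alpha} + (f_e - u^-_e)^{-2-\alpha}\right)\bar\Delta_e^2$$
$$\overset{(i)}{\le} \tilde g^\top \bar\Delta/2 + \frac{\kappa^2}{1000} + \alpha\kappa/10 \cdot \|L(f)\bar\Delta\|_1 \overset{(ii)}{\le} -\frac{\kappa^2}{100} + \frac{\kappa^2}{1000} + \frac{\alpha\kappa^2}{250} \le -\frac{\kappa^2}{500}.$$

Here, (i) follows from (17), the bound on $|c^\top \bar\Delta|$ above, and the bound $|\bar\Delta_e| \le \kappa/10 \cdot \min(u^+_e - f_e, f_e - u^-_e)$, and (ii) from
$\tilde g^\top \bar\Delta = \eta \tilde g^\top \Delta = -\kappa^2/50$ and (21).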


Our next goal is to show that a straight line to $f^*$, i.e. $\Delta = f^* - f$, satisfies the guarantees of
Lemma 4.4 for some $\kappa \ge \tilde\Omega(1)$. This has two purposes. First, it shows that an $m^{o(1)}$-optimal solution
to (12) allows us to decrease the potential by $m^{-o(1)}$ per step, so that the algorithm terminates in
$m^{1+o(1)}$ steps. Second, it shows that the problems (12) encountered during the method are not fully
adaptive, and we are able to use this guarantee on a good solution to inform our data structures.

Lemma 4.7 (Quality of $f^* - f$). Let $\tilde g \in \mathbb{R}^E$ satisfy $\|L(f)^{-1}(\tilde g - g(f))\|_\infty \le \varepsilon$ for some $\varepsilon < \alpha/2$,
and $\tilde\ell \in \mathbb{R}^E_{>0}$ satisfy $\tilde\ell \approx_2 \ell(f)$. If $\Phi(f) \le 200m \log mU$ and $\log(c^\top f - F^*) \ge -10 \log mU$, then

$$\frac{\tilde g^\top (f^* - f)}{100m + \|\tilde L (f^* - f)\|_1} \le -\alpha/4.$$

The additional $100m$ in the denominator is for a technical reason, and intuitively says that
the bound is still fine even if we force every edge to pay at least 100 towards the $\ell_1$ length of the
circulation.


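Proof. We can bound $g(f)^\top (f^* - f)$ using

$$g(f)^\top (f^* - f) = 20m\,\frac{c^\top (f^* - f)}{c^\top f - F^*} + \alpha\sum_{e \in E}\left((u^+_e - f_e)^{-1-\alpha} - (f_e - u^-_e)^{-1-\alpha}\right)(f^*_e - f_e)$$
$$\overset{(i)}{\le} -20m - \alpha\sum_{e \in E}\left((u^+_e - f_e)^{-1-\alpha} + (f_e - u^-_e)^{-1-\alpha}\right)|f^*_e - f_e| + 2\alpha\sum_{e \in E}\left((u^+_e - f_e)^{-\alpha} + (f_e - u^-_e)^{-\alpha}\right)$$
$$= -20m - \alpha\|L(f)(f^* - f)\|_1 + 2\alpha\left(\Phi(f) - 20m\log(c^\top f - F^*)\right)$$
$$\overset{(ii)}{\le} -20m - \alpha\|L(f)(f^* - f)\|_1 + 800\alpha m \log mU$$
$$\overset{(iii)}{\le} -19m - \alpha\|L(f)(f^* - f)\|_1,$$

where (i) follows from the bound $u^-_e - f_e \le f^*_e - f_e \le u^+_e - f_e$ for all $e \in E$, so

$$(u^+_e - f_e)^{-1-\alpha}(f^*_e - f_e) = (u^+_e - f_e)^{-1-\alpha}(f^*_e - u^+_e + u^+_e - f_e)$$
$$= (u^+_e - f_e)^{-\alpha} - (u^+_e - f_e)^{-1-\alpha}|f^*_e - u^+_e|$$
$$\le 2(u^+_e - f_e)^{-\alpha} - (u^+_e - f_e)^{-1-\alpha}|f^*_e - f_e|,$$

and similar for the $-(f_e - u^-_e)^{-1-\alpha}(f^*_e - f_e)$ term, (ii) follows from the lemma hypotheses, and
(iii) from the choice $\alpha = 1/(1000 \log mU)$. We now bound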


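$$\tilde g^\top (f^* - f) = g(f)^\top (f^* - f) + (\tilde g - g(f))^\top (f^* - f)$$
$$\overset{(i)}{\le} -19m - \alpha\|L(f)(f^* - f)\|_1 + \left\|L(f)^{-1}(\tilde g - g(f))\right\|_\infty \|L(f)(f^* - f)\|_1$$
$$\overset{(ii)}{\le} -19m - \alpha\|L(f)(f^* - f)\|_1 + \varepsilon\|L(f)(f^* - f)\|_1$$
$$\le -19m - \alpha/2 \cdot \|L(f)(f^* - f)\|_1,$$

where (i) follows from the above bound on $g(f)^\top (f^* - f)$ and Hölder for the $\ell_1/\ell_\infty$ norms, and
(ii) follows from the conditions on $\tilde g$. Thus we get that

$$\frac{\tilde g^\top (f^* - f)}{100m + \|\tilde L (f^* - f)\|_1} \le \frac{-19m - \alpha/2 \cdot \|L(f)(f^* - f)\|_1}{100m + 2\|L(f)(f^* - f)\|_1} \le -\alpha/4,$$

where we have used the above bound on $\tilde g^\top (f^* - f)$ and $\tilde\ell \approx_2 \ell$.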


4.2 Stability Bounds


Our algorithm ultimately approximately solves (12) by using approximations $\tilde g$ of $g(f)$ and $\tilde\ell$ of
$\ell(f)$ satisfying the conditions of Lemma 4.4. The goal of this section is to show that $g(f)$ and $\ell(f)$
are slowly changing relative to the lengths, so that our dynamic data structure only needs to update
their values on $m^{o(1)}$ edges per iteration.

We start by showing that the residual cost is very slowly changing, by about $1/m$ per iteration.


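Lemma 4.8 (Residual stability). Let $\tilde g \in \mathbb{R}^E$ satisfy $\|L(f)^{-1}(\tilde g - g(f))\|_\infty \le \varepsilon$ for some $\varepsilon \in
(0, 1/2]$, and $\tilde\ell \in \mathbb{R}^E_{>0}$ satisfy $\tilde\ell \approx_2 \ell(f)$. Let $\Delta$ satisfy $B^\top \Delta = 0$ and $\tilde g^\top \Delta / \|\tilde L \Delta\|_1 \le -\kappa$ for
$\kappa \in (0,1)$. Then

$$\frac{|c^\top \Delta|}{c^\top f - F^*} \le |\tilde g^\top \Delta| / (\kappa m).$$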


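Proof. We can write

$$\frac{|c^\top \Delta|}{c^\top f - F^*} \overset{(i)}{\le} \frac{1}{20m}\left(|g(f)^\top \Delta| + \alpha\|L(f)\Delta\|_1\right)$$
$$\overset{(ii)}{\le} \frac{1}{20m}\left(|\tilde g^\top \Delta| + \left\|L(f)^{-1}(\tilde g - g(f))\right\|_\infty \|L(f)\Delta\|_1 + \alpha\|L(f)\Delta\|_1\right)$$
$$\overset{(iii)}{\le} \left(|\tilde g^\top \Delta| + 2\|\tilde L \Delta\|_1\right)/(20m) \le |\tilde g^\top \Delta| / (\kappa m),$$

where (i) uses the triangle inequality, (ii) uses the triangle inequality and $|x^\top y| \le \|x\|_\infty \|y\|_1$, and
(iii) uses the hypotheses and $\tilde\ell \approx_2 \ell(f)$.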


Hence if $|g(f)^\top \Delta| \le O(\kappa^2)$ and $\|L(f)\Delta\|_1 = O(\kappa)$, such as in the hypotheses of Lemma 4.4,
the residual cost changes by at most a $1/m$ factor per iteration.
We show that if the residual capacity of an edge does not change much, then its length is stable.


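Lemma 4.9 (Length stability). If $\|L(f)(f - \bar f)\|_\infty \le \varepsilon$ for some $\varepsilon \le 1/100$ then $\ell(f) \approx_{1+3\varepsilon} \ell(\bar f)$.


Proof. Because $\|L(f)(f - \bar f)\|_\infty \le \varepsilon \le 1/100$, we have for all $e \in E$

$$|f_e - \bar f_e| \le \varepsilon(u^+_e - f_e)^{1+\alpha} \le \varepsilon(u^+_e - f_e)(2U)^\alpha \le 2\varepsilon(u^+_e - f_e).$$

Similarly, $|f_e - \bar f_e| \le 2\varepsilon(f_e - u^-_e)$. Hence $u^+_e - \bar f_e \approx_{\exp(2\varepsilon)} u^+_e - f_e$ and $\bar f_e - u^-_e \approx_{\exp(2\varepsilon)} f_e - u^-_e$,
so we get

$$\ell(\bar f)_e = (u^+_e - \bar f_e)^{-1-\alpha} + (\bar f_e - u^-_e)^{-1-\alpha}$$
$$\approx_{\exp(2\varepsilon(1+\alpha))} (u^+_e - f_e)^{-1-\alpha} + (f_e - u^-_e)^{-1-\alpha} = \ell(f)_e.$$

This completes the proof, as $2\varepsilon(1 + \alpha) \le 3\varepsilon$.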


Next we show a similar stability claim for gradients. Here, we scale by the residual cost $c^\top f - F^*$
to ensure that the leading term is $20mc$. Thus, the gradient is stable if the residual capacity of an
edge does not change much, and if the residual cost is stable. We know that the residual cost is
stable over $\tilde O(m)$ iterations by Lemma 4.8.
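

Lemma 4.10 (Gradient stability). If $\|L(f)(f - \bar f)\|_\infty \le \varepsilon$ and $r \approx_{1+\varepsilon} c^\top \bar f - F^*$ then $\tilde g$ defined as

$$\tilde g_e = 20m c_e / r + \alpha(u^+_e - f_e)^{-1-\alpha} - \alpha(f_e - u^-_e)^{-1-\alpha} \quad \text{for all } e \in E$$

satisfies

$$\left\|L(f)^{-1}\left(\tilde g - (c^\top \bar f - F^*)/r \cdot g(\bar f)\right)\right\|_\infty \le 6\alpha\varepsilon.$$

Proof. We can first compute that

$$\left[\tilde g - (c^\top \bar f - F^*)/r \cdot g(\bar f)\right]_e = \left(1 - \frac{c^\top \bar f - F^*}{r}\right)\left(\alpha(u^+_e - \bar f_e)^{-1-\alpha} - \alpha(\bar f_e - u^-_e)^{-1-\alpha}\right) \qquad (22)$$
$$\qquad + \alpha\left((u^+_e - f_e)^{-1-\alpha} - (u^+_e - \bar f_e)^{-1-\alpha}\right) - \alpha\left((f_e - u^-_e)^{-1-\alpha} - (\bar f_e - u^-_e)^{-1-\alpha}\right). \qquad (23)$$

We bound the terms in (22) and (23) separately. To bound (22) we write

$$\ell(f)_e^{-1}\left|1 - \frac{c^\top \bar f - F^*}{r}\right| \cdot \alpha\left|(u^+_e - \bar f_e)^{-1-\alpha} - (\bar f_e - u^-_e)^{-1-\alpha}\right| \le \alpha\left|1 - \frac{c^\top \bar f - F^*}{r}\right| \cdot \frac{\ell(\bar f)_e}{\ell(f)_e} \overset{(i)}{\le} 2\alpha\varepsilon,$$

where (i) follows from the hypothesis and Lemma 4.9. To bound (23) we write

$$\alpha\,\ell(f)_e^{-1}\left|(u^+_e - f_e)^{-1-\alpha} - (u^+_e - \bar f_e)^{-1-\alpha} - \left((f_e - u^-_e)^{-1-\alpha} - (\bar f_e - u^-_e)^{-1-\alpha}\right)\right|$$
$$\overset{(i)}{\le} 2\alpha\,\ell(f)_e^{-1}\left((u^+_e - f_e)^{-2-\alpha} + (f_e - u^-_e)^{-2-\alpha}\right)|f_e - \bar f_e|$$
$$\le 2\alpha \max\left\{(u^+_e - f_e)^\alpha, (f_e - u^-_e)^\alpha\right\} \|L(f)(f - \bar f)\|_\infty \le 2\alpha(2U)^\alpha \varepsilon \le 4\alpha\varepsilon,$$

where (i) follows from Lemma 4.5, specifically (14), (15). Summing these gives the desired bound.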


We now show the main result of this section, Theorem 4.3.


Proof of Theorem 4.3. The first item follows from Lemma 4.7. To show the third item, note that
the update in the second item exactly corresponds to Lemma 4.4, so $\Phi(f^{(t)}) \le \Phi(f^{(t-1)}) - \Omega(\kappa^2)$.
Once the potential has reduced to $-O(m \log m)$, then $c^\top f^{(t)} - c^\top f^* \le (mU)^{-10}$ (Lemma 4.1), so
the algorithm takes $\tilde O(m\kappa^{-2})$ total iterations.


4.3 Initial and Final Point 


In this section we discuss how to initialize our method and how to get an exact optimal solution 
from a nearly optimal solution. For the latter piece, we can directly cite previous work which gives 
a rounding method using the Isolation Lemma. 


Lemma 4.11 ([BLNPSSSW20, Lemma 8.10]). Consider a min-cost flow instance $\mathcal{I} = (G, d, c)$ on
a graph $G = (V, E)$ with demands $d \in \{-U, \ldots, U\}^V$ and cost $c \in \{-U, \ldots, U\}^E$. Assume that all
optimal flows have congestion at most $U$ on every edge.

Consider a perturbed instance $\tilde{\mathcal{I}} = (G, d, \tilde c)$ on the same graph $G = (V, E)$ and demand $d$,
but with modified cost vector $\tilde c \in \mathbb{R}^E$ defined as $\tilde c_e = c_e + z_e$ for independent, random
$z_e \in \left\{\frac{1}{4m^2U^2}, \ldots, \frac{2mU}{4m^2U^2}\right\}$ for all $e \in E$. Let $\tilde f$ be a solution for $\tilde{\mathcal{I}}$ whose cost is at most $\frac{1}{12m^2U^3}$
from optimal. Let $f$ be obtained by rounding $\tilde f$ to the nearest integer on every edge. Then $f$ is an
optimal flow for the instance $\mathcal{I}$ with probability at least $1/2$.


It is worth noting that scaling up the cost vector $\tilde c$ of the perturbed instance $\tilde{\mathcal{I}}$ in Lemma 4.11
by $4m^2U^2$ results in a min-cost flow instance with integral demands and costs again.
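
The remark about rescaling can be illustrated with exact rational arithmetic. In the sketch below, the perturbations are integer multiples of $1/(4m^2U^2)$, which is what makes the scaled costs integral; the specific range $1 \le k \le 2mU$ for the multiple is an assumption of this example, and all function names are hypothetical.

```python
from fractions import Fraction
import random


def perturb_costs(costs, m, U, rng):
    """Add an independent random multiple of 1/(4 m^2 U^2) to every cost.

    The multiple k is drawn uniformly from {1, ..., 2mU}; this range is an
    assumption made for illustration.
    """
    denom = 4 * m * m * U * U
    return [Fraction(c) + Fraction(rng.randint(1, 2 * m * U), denom) for c in costs]


def scale_to_integers(perturbed, m, U):
    """Scale perturbed costs by 4 m^2 U^2 and return them as integers."""
    scale = 4 * m * m * U * U
    scaled = [c * scale for c in perturbed]
    if any(s.denominator != 1 for s in scaled):
        raise ValueError("costs are not multiples of 1/(4 m^2 U^2)")
    return [int(s) for s in scaled]
```

Using `Fraction` avoids floating-point rounding, so the integrality check after scaling is exact.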

Now we describe how to augment our original graph with additional edges in a way that does not
affect the optimal solution, but allows us to initialize a solution with bounded potential. The proof is
deferred to Appendix B.1.


Lemma 4.12 (Initial Point). There is an algorithm that, given a graph $G = (V, E)$ and a min-cost
flow instance $\mathcal{I} = (G, d, c, u^+, u^-)$ with demands $d \in \{-U, \ldots, U\}^V$, and costs and lower/upper
capacities $c, u^-, u^+ \in \{-U, \ldots, U\}^E$, constructs a min-cost flow instance $\tilde{\mathcal{I}} = (\tilde G, \tilde d, \tilde c, \tilde u^+, \tilde u^-)$
with $O(m)$ edges and $\tilde d \in \{-2mU, \ldots, 2mU\}^{V(\tilde G)}$, $\tilde c, \tilde u^+, \tilde u^- \in \{-4mU^2, \ldots, 4mU^2\}^{E(\tilde G)}$ and a
flow $f^{(\mathrm{init})}$ on $\tilde G$ routing $\tilde d$ such that $\Phi(f^{(\mathrm{init})}) \le 200m \log mU$.

Also, given an optimal flow $\tilde f$ for $\tilde{\mathcal{I}}$, the algorithm can either compute an optimal flow $f$ for $\mathcal{I}$
or conclude that $\mathcal{I}$ admits no feasible flow. The algorithm runs in time $\tilde O(m)$.


5 Decremental Spanner and Embedding 


The main result of this section is summarized in the following theorem. Intuitively, the theorem 
states that given a low-degree graph G, one can maintain a sparsifier H of G and embed G with 
short paths and low congestion into H. 


Theorem 5.1. Given an $m$-edge $n$-vertex unweighted, undirected, dynamic graph $G$ undergoing
update batches $U^{(1)}, U^{(2)}, \ldots$ consisting only of edge deletions and $\tilde O(n)$ vertex splits. There is a
randomized algorithm with parameter $1 \le L \le o\!\left(\frac{\sqrt{\log(m)}}{\log\log(m)}\right)$ that maintains a spanner $H$ and an
embedding $\Pi_{G \to H}$ such that


1. Sparsity and Low Recourse: initially $H^{(0)}$ has sparsity $\tilde O(n)$. At any stage $t \ge 1$, the algorithm
outputs a batch of updates $U_H^{(t-1)}$ that when applied to $H^{(t-1)}$ produce $H^{(t)}$ such that $H^{(t)} \subseteq
G^{(t)}$, $H^{(t)}$ consists of at most $\tilde O(n)$ edges and $\sum_{t' \le t} \mathrm{ENC}(U_H^{(t')}) = \tilde O\big(n + \sum_{t' \le t} |U^{(t')}| \cdot n^{1/L}\big)$,
and


2. Low Congestion, Short Paths Embedding: $\mathrm{length}(\Pi_{G \to H}) \le (\gamma_l)^{O(L)}$ and $\mathrm{vcong}(\Pi_{G \to H}) \le
(\gamma_c)^{O(L)} \Delta_{\max}(G)$, for $\gamma_l, \gamma_c = \exp(O(\sqrt{\log m} \cdot \log\log m))$, and


3. Low Recourse Re-Embedding: the algorithm further reports, after each update batch $U^{(t)}$ at
stage $t$ is processed, a (small) set $D^{(t)} \subseteq E(H^{(t)})$ of edges, such that for all other edges
$e \in E(H^{(t)}) \setminus D^{(t)}$, there exists no edge $e' \in E(G^{(t)})$ whose embedding path $\Pi^{(t)}_{G \to H}(e')$ contains
$e$ at the current stage but did not before the stage. The algorithm ensures that at any stage
$t$, we have $\sum_{t' \le t} |D^{(t')}| = \tilde O\big(\sum_{t' \le t} |U^{(t')}| \cdot n^{1/L} (\gamma_c \gamma_l)^{O(L)}\big)$, i.e. that the sets $D^{(t)}$ are roughly
upper bounded by the size of $U^{(t)}$ on average.


The algorithm takes initialization time $\tilde O(m)$ and processing the $t$-th update batch $U^{(t)}$ takes
amortized update time $\tilde O(\mathrm{ENC}(U^{(t)}) \cdot n^{1/L} (\gamma_c \gamma_l)^{O(L)} \Delta_{\max}(G))$, and succeeds with probability at least
$1 - n^{-C}$ for any constant $C > 0$, specified before the procedure is invoked.
Taking $L = (\log m)^{1/4}$ in Theorem 5.1 gives a parameter $\gamma_s = \exp(O(\log^{3/4} m \log\log m))$ such that the amortized runtime, lengths of the embeddings, and amortized size of $D$ are all $O(\gamma_s)$. We emphasize that guarantees 1 and 3 are with respect to the number of updates in each batch $U^{(t)}$ and not with respect to the (possibly much larger) encoding size of $U^{(t)}$. This is of utmost importance for our application.
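The arithmetic behind this choice of $L$ can be spelled out as follows. This is a worked sketch, not part of the original statement; it assumes $n \le m$ (so that $\log n \le \log m$) and $\gamma_\ell, \gamma_c = \exp(O(\sqrt{\log m} \cdot \log\log m))$, and it combines the $n^{1/L}$ and $(\gamma_c\gamma_\ell)^{O(L)}$ factors that appear in Theorem 5.1.

```latex
\begin{align*}
n^{1/L} &= \exp\!\left(\frac{\log n}{(\log m)^{1/4}}\right)
          \le \exp\!\left(\log^{3/4} m\right), \\
(\gamma_c \gamma_\ell)^{O(L)}
        &= \exp\!\left(O\!\left(\log^{1/2} m \cdot \log\log m\right) \cdot O\!\left(\log^{1/4} m\right)\right)
         = \exp\!\left(O\!\left(\log^{3/4} m \cdot \log\log m\right)\right), \\
\gamma_s &= n^{1/L} \, (\gamma_c \gamma_\ell)^{O(L)}
         = \exp\!\left(O\!\left(\log^{3/4} m \cdot \log\log m\right)\right) = m^{o(1)}.
\end{align*}
```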

In this section, we will prove Theorem 5.1 under the assumption that the update sequence has length at most $n - 1$ and that each update batch consists only of a single update. This is without loss of generality as one can restart the algorithm every $n$ updates without affecting any of the bounds.


5.1 The Algorithm 


Data Structures. Our algorithm to implement Theorem 5.1 uses a rather standard batching approach with $L + 1$ levels. The algorithm therefore maintains graphs $H_0, H_1, \ldots, H_L$ which implicitly form the sparsifier $H = \bigcup_i H_i$. The algorithm recomputes each graph $H_i$ every so often, with shallower levels being recomputed less often than deeper levels. However, each $H_i$ undergoes frequent changes, since after each stage we apply the updates to $G$ to each graph $H_i$ (if applicable) such that the graphs remain subgraphs of $G$ with the same vertex set at any stage.

It further maintains embeddings $\Pi_0, \Pi_1, \ldots, \Pi_L$ where each embedding $\Pi_i$ maps a subset of $E(G)$ into the graph $H_{\le i} = \bigcup_{j \le i} H_j$. In the algorithm, the pre-images of the embeddings $\Pi_0, \Pi_1, \ldots, \Pi_L$ are not disjoint, and in fact, we let $\Pi_0$ always have the full set $E(G)$ in its pre-image. We define $\Pi_{\le j}$ to be the embedding that maps each edge $e \in E(G)$ via the embedding $\Pi_i$ with the largest $i \le j$ that has $e$ in its pre-image.

Whenever we recompute a graph $H_j$, we also recompute $\Pi_j$ such that after recomputation $\Pi_{\le j}$ embeds the current graph $G$ into the current graph $H_{\le j}$. As for the graphs $H_j$, we apply updates to $G$ to the embedding paths in $\Pi_j$, which means that eventually an edge $e \in E(G)$ with endpoints $a, b$ might not be mapped by $\Pi_j(e)$ to an actual $a$-$b$ path, either because edges on $\Pi_j(e)$ are deleted, or vertices are split, or both. However, the algorithm ensures that most edges are correctly mapped via $\Pi_{\le j}$ at all times, and the small fraction of edges that is not properly dealt with is then handled by the embeddings $\Pi_{j+1}, \Pi_{j+2}, \ldots, \Pi_L$ on deeper levels. The embedding $\Pi_{G \to H}$ is again maintained implicitly and defined as $\Pi_{G \to H} := \Pi_{\le L}$.

Finally, we maintain sets $S_0, S_1, \ldots, S_L$. Each set $S_i$ consists of the vertices that are touched by the updates to $G$ since the last time that $H_i$ was recomputed. We give a formal definition for touched vertices below.


Definition 5.2. We say that the t-th update to G touches a vertex u if the update is an edge 
deletion and u is one of its endpoints, or if the update is a vertex split and u is one of the resulting 
vertices from the split. 


Initialization. We start our algorithm by running the sparsification procedure below to initialize $H_0$ and $\Pi_0$.


Theorem 5.3. Given an unweighted, undirected graph $G$, there is a procedure SPARSIFY($G$) that produces a sparse subgraph $H_0 \subseteq G$ and an embedding $\Pi_0$ of vertex-congestion at most $2\gamma_c \Delta_{\max}(G)$ and length at most $\gamma_\ell$, with high probability. The algorithm takes time $O(m\gamma_\ell)$.
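The two quality measures used throughout this section can be made concrete with a small sketch. It assumes the usual conventions that an embedding maps each edge to a path given as a vertex list, that its length is the maximum number of edges on any path, and that its vertex congestion is the maximum number of embedding paths through any single vertex. The function names and the toy graph are illustrative only.

```python
from collections import Counter

def embedding_length(embedding):
    """Maximum number of edges on any embedding path."""
    return max((len(path) - 1 for path in embedding.values()), default=0)

def vertex_congestion(embedding):
    """Maximum, over vertices v, of the number of embedded edges whose path visits v."""
    load = Counter()
    for path in embedding.values():
        for v in set(path):  # a path contributes at most once per vertex
            load[v] += 1
    return max(load.values(), default=0)

# Toy example: G is the 4-cycle 0-1-2-3-0 plus the chord (0, 2);
# the sparsifier drops the chord and reroutes it through vertex 1.
toy_embedding = {
    (0, 1): [0, 1],
    (1, 2): [1, 2],
    (2, 3): [2, 3],
    (3, 0): [3, 0],
    (0, 2): [0, 1, 2],
}
```

In the toy example the rerouted chord is the only path with two edges, and vertices 0, 1 and 2 each carry three paths.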


For $j > 0$, we initialize $H_j$ to the empty graph and $\Pi_j$ to be an empty map. We set all sets $S_j$ (including $S_0$) to the empty vertex set.
Updates. At each stage $t = 1, 2, \ldots$, we invoke the procedure UPDATE($t$) that is given in Algorithm 1. As mentioned earlier, the algorithm works by batching updates made to the graph $G$. In Line 3, the algorithm determines the current batch level $j$, which in turn determines the batch size that has to be handled at the current stage $t$. Once $j$ is determined, we also find the last stage $t_{j-1}$ at which a level-$(j-1)$ update occurred.
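The batching schedule of Lines 3 and 4 can be illustrated with a short sketch. It makes the simplifying assumption, not stated in the text, that $n = b^L$ for an integer $b$, so that $n^{1-j/L} = b^{L-j}$ is an integer; the function names are hypothetical.

```python
def batch_level(t, b, L):
    """Smallest j in {0, ..., L} such that t is divisible by b**(L - j) = n**(1 - j/L)."""
    for j in range(L + 1):
        if t % b ** (L - j) == 0:
            return j
    raise AssertionError("unreachable: every t is divisible by b**0 == 1")

def last_stage_of_previous_level(t, b, L, j):
    """Most recent stage <= t that is a multiple of n**(1 - (j-1)/L), as in Line 4."""
    period = b ** (L - (j - 1))
    return (t // period) * period

# With b = 2 and L = 3 (so n = 8), the levels of stages 1..8 are 3, 2, 3, 1, 3, 2, 3, 0.
levels = [batch_level(t, 2, 3) for t in range(1, 9)]
```

Deeper levels (larger $j$) are triggered more often and handle smaller batches, which matches the description that shallower levels are recomputed less often.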

The algorithm then sets all graphs and embeddings of levels $j, j+1, \ldots, L$ to the empty graphs/embeddings in Line 6.

It then forms the graph $J$, which is the key object in this section. This graph is constructed by finding all edges $e$ whose embedding path $\Pi_{<j}(e)$ was affected by updates since the last recomputations of $\Pi_0, \Pi_1, \ldots, \Pi_{j-1}$, and adding a projection $\hat{e}$ of each such $e$ onto the vertex set $S_{j-1}$ to $J$.

The idea behind this projection is that, as $S_{j-1}$ is the set of vertices touched since this stage, an affected edge $e$ has some vertex of $S_{j-1}$ on its path. Assuming for the moment that the edge $e$ itself is not incident to a vertex that was split since the recomputation stage, we essentially project the endpoints $a, b$ of $e$ to the nearest vertices $\hat{a}, \hat{b}$ in $S_{j-1}$ on the old embedding path $\Pi_{<j}(e)$. Note in particular that by definition the paths $\Pi_{<j}(e)[a, \hat{a}]$ and $\Pi_{<j}(e)[\hat{b}, b]$ are then still in $G$.

We give the following more formal definition that also defines a projection for the slightly more 
involved case where the edge e is incident to a vertex that splits over time. 


Definition 5.4 (Edge-Embedding Projection). Consider any $0 < j \le L$, embedding $\Pi_{<j}$, set $S_{j-1}$ being the set of all vertices touched by updates to $G$ since the last stage that $\Pi_{<j}$ was modified, and edge $e \in E(G)$ such that $\Pi_{<j}(e) \cap S_{j-1} \ne \emptyset$. Then, we let $\mathrm{proj}_{j-1}(e)$ be a new edge $\hat{e}$ that is associated with $e$ and has endpoints $\hat{a}$ and $\hat{b}$ being the closest vertices in $S_{j-1}$ to the endpoints of $e$ in the current graph $\Pi_{<j}(e)$, respectively. Here $\Pi_{<j}(e)$ refers to the graph over the entire vertex set $V(G)$ obtained from adding the edges that are on $\Pi_{<j}(e)$ and then applying the relevant updates that took place on $G$ since $\Pi_{<j}$ was last modified.


As previously mentioned, we project those edges $e$ whose embedding path $\Pi_{<j}(e)$ was affected by updates onto $S_{j-1}$ to obtain an edge $\hat{e}$, which is added to the graph $J$. Note that as projected edges are associated with edges $e$ in $G$, the graph $J$ can be a multigraph.

Along with $J$, there is also a natural embedding of the projected edge $\hat{e}$ into the sparsifier $H_{<j}$: we can simply take the path $\Pi_{<j}(e)[\hat{a}, a] \oplus e \oplus \Pi_{<j}(e)[b, \hat{b}]$, which we already argued to exist in the current graph $H_{<j}$. This embedding is constructed in Line 13.
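The projection and the path built in Line 13 can be sketched in a few lines. This is a simplified illustration that assumes no vertex splits (only the easy case discussed above), represents the old embedding path of $e = (a, b)$ as a vertex list from $a$ to $b$, and uses hypothetical names throughout.

```python
def project_edge(path, touched):
    """Project e = (path[0], path[-1]) onto the touched vertices on its old embedding path.

    Returns (a_hat, b_hat, route), where route is the vertex sequence of
    path[a_hat..a] followed by the edge e itself and then path[b..b_hat],
    or None if no vertex on the path was touched (e is unaffected).
    """
    hits = [i for i, v in enumerate(path) if v in touched]
    if not hits:
        return None
    first, last = hits[0], hits[-1]
    a_hat, b_hat = path[first], path[last]
    a_hat_to_a = path[first::-1]      # walk from a_hat back to a
    b_to_b_hat = path[last:][::-1]    # walk from b back to b_hat
    # The step from a (end of the first part) to b (start of the second) is the edge e.
    return a_hat, b_hat, a_hat_to_a + b_to_b_hat

old_path = ["a", "x", "y", "z", "b"]
```

Because `a_hat` and `b_hat` are the touched vertices nearest to the endpoints, the two outer segments contain no other touched vertex, which is why they survive the updates. When a single vertex is touched, both projected endpoints coincide and the projected edge is a self-loop, consistent with $J$ being a multigraph.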

Finally, the procedure SPARSIFY is invoked on the graph $J$. While we have seen this procedure before in the initialization stage, here we use a generalized version that incorporates the embedding from $J$ into $H$ constructed above. The guarantees of our generalized procedure SPARSIFY are summarized below. Note that letting $J = G$ and letting $\Pi_{J \to H'}$ be the identity function, we recover Theorem 5.3 as a corollary.


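Theorem 5.5. Given unweighted, undirected graphs $H'$ and $J$ with $V(J) \subseteq V(H')$ and an embedding $\Pi_{J \to H'}$ from $J$ into $H'$, there is a randomized algorithm SPARSIFY($H'$, $J$, $\Pi_{J \to H'}$) that returns a sparsifier $\tilde{J} \subseteq J$ with $|E(\tilde{J})| = O(|V(J)|)$ and an embedding $\Pi_{J \to \tilde{J}}$ from $J$ to $\tilde{J}$ such that


1. $\mathrm{vcong}(\Pi_{J \to H'} \circ \Pi_{J \to \tilde{J}}) \le \gamma_c \cdot (\mathrm{vcong}(\Pi_{J \to H'}) + \Delta_{\max}(J))$, and
2. $\mathrm{length}(\Pi_{J \to H'} \circ \Pi_{J \to \tilde{J}}) \le \gamma_\ell \cdot \mathrm{length}(\Pi_{J \to H'})$.

The algorithm runs in time $O(|E(J)| \cdot \gamma_\ell)$ and succeeds with probability at least $1 - n^{-C}$ for any constant $C$, specified before the procedure is invoked.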
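Algorithm 1: UPDATE($t$)


1 Update all sparsifiers $H_0, H_1, \ldots, H_L$ with the $t$-th update if it applies.
2 Add the vertices touched by the $t$-th update to each of the sets $S_0, S_1, \ldots, S_L$.
3 $j \leftarrow \min\{j' \in \mathbb{Z}_{\ge 0} \mid t \text{ is divisible by } n^{1-j'/L}\}$. // Determine level $j$ of stage $t$.
4 $t_{j-1} \leftarrow \lfloor t / n^{1-(j-1)/L} \rfloor \cdot n^{1-(j-1)/L}$. // $t_{j-1}$ is the most recent level-$(j-1)$ stage.
5 for $i = j, j+1, \ldots, L$ do // Re-set level-$\ge j$ sparsifiers.
6     $H_i \leftarrow (V, \emptyset)$; Set $\Pi_i$ to be an empty map; $S_i \leftarrow \emptyset$.
  // Auxiliary graph and embedding by projecting embedding paths onto $S_{j-1}$.
7 $J \leftarrow (S_{j-1}, \emptyset)$; $\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}} \leftarrow \emptyset$.
8 $E_{\mathrm{affected}} \leftarrow \{e \in E(G) \text{ with } \Pi_{<j}(e) \cap S_{j-1} \ne \emptyset\}$.
9 foreach edge $e \in E_{\mathrm{affected}}$ do
10     $\hat{e} \leftarrow \mathrm{proj}_{j-1}(e)$. // Find projected edge of $e$.
11     Let $a$ and $b$ be the endpoints of $e$, and $\hat{a}$ and $\hat{b}$ be the endpoints of $\hat{e}$.
12     Add $\hat{e}$ to $J$.
13     $\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}}(\hat{e}) \leftarrow \Pi_{<j}(e)[\hat{a}, a] \oplus e \oplus \Pi_{<j}(e)[b, \hat{b}]$.
  // Sparsify auxiliary graph and translate back to re-build $H$.
14 $(\tilde{J}, \Pi_{J \to \tilde{J}}) \leftarrow$ SPARSIFY($H_{<j} \cup E_{\mathrm{affected}}$, $J$, $\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}}$).
15 foreach edge $e \in E_{\mathrm{affected}}$ with $\hat{e} = \mathrm{proj}_{j-1}(e) \in \tilde{J}$ do Add $e$ to $H_j$.
16 foreach edge $e \in E_{\mathrm{affected}}$ do
17     $\hat{e} \leftarrow \mathrm{proj}_{j-1}(e)$.
18     Let $a$ and $b$ be the endpoints of $e$, and $\hat{a}$ and $\hat{b}$ be the endpoints of $\hat{e}$.
19     $\Pi_j(e) \leftarrow \Pi_{<j}(e)[a, \hat{a}] \oplus [\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}} \circ \Pi_{J \to \tilde{J}}](\hat{e}) \oplus \Pi_{<j}(e)[\hat{b}, b]$.
20 return $D = \Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}}(\tilde{J})$.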


We defer the proof of the theorem to Section 5.2 and finish the description of Algorithm 1. Given the sparsified graph $\tilde{J}$ (along with the embedding map), we find the pre-images of the edges in $\tilde{J}$ and add them to $H_j$. We then re-embed all edges $(a, b)$ in $G$ that were no longer properly embedded into $H$ by using the embedding $\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}} \circ \Pi_{J \to \tilde{J}}$ to get from $\hat{a}$ to $\hat{b}$, and then prepending (appending) the path $\Pi_{<j}(e)$ from $a$ to $\hat{a}$ (from $\hat{b}$ to $b$). To gain better intuition for the resulting embedding, we recommend that the reader follow the analysis in Lemma 5.6.


Proof of Theorem 5.1. As a first part of the analysis, we establish that the algorithm indeed maintains an actual embedding.


Lemma 5.6. For any $0 \le i \le L$, and stage $t$ divisible by $n^{1-i/L}$, $\Pi^{(t)}_{\le i}$ embeds $G^{(t)}$ into $H^{(t)}_{\le i}$. In particular, at any stage $t$, $\Pi^{(t)}_{G \to H} = \Pi^{(t)}_{\le L}$ embeds $G^{(t)}$ into $H^{(t)}$.


Proof. We prove by induction on the stage $t$. For the base case ($t = 0$), we observe that in the initialization phase $\Pi_0$ embeds $G^{(0)}$ into $H_0$ via the algorithm in Theorem 5.3. For all other $j > 0$, $\Pi_j$ is an empty embedding and therefore, the claim follows.

Let us next consider the inductive step $t - 1 \mapsto t$: We let $j = j^{(t)}$ and $t_{j-1} = t^{(t)}_{j-1}$. Consider any edge $e \in E(G^{(t)})$. We first note that the path $\Pi^{(t_{j-1})}_{<j}(e)$ in $H^{(t_{j-1})}_{<j}$ exists since we can invoke the induction hypothesis by the fact that $t_{j-1} < t$, which follows from the minimality of $j$ (see Line 3). By definition of $S_{j-1}$ being the vertices touched by all updates to $G$ since $t_{j-1}$, we have that if $\Pi^{(t_{j-1})}_{<j}(e) \cap S^{(t)}_{j-1} = \emptyset$, then $\Pi^{(t_{j-1})}_{<j}(e)$ still embeds $e$ properly into $H^{(t)}_{<j}$. Since the foreach-loop in Line 16 is not entered for such edges $e$, we further have that $\Pi^{(t)}_{\le j}(e) = \Pi^{(t_{j-1})}_{<j}(e)$.

It remains to analyze the case $\Pi^{(t_{j-1})}_{<j}(e) \cap S^{(t)}_{j-1} \ne \emptyset$, where we note that $\hat{e} = \mathrm{proj}_{j-1}(e)$ is well-defined. We start with the observation that $\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}}(\hat{e})$ restricted to the edges in $\tilde{J}$ only maps to the edges in $H^{(t)}_{<j}$ and the pre-images of edges $\hat{e} \in \tilde{J}$ under the $\mathrm{proj}_{j-1}(\cdot)$ map, as can be seen easily from its construction in Line 13. But $H^{(t)}_j$ consists exactly of the pre-images of $\tilde{J}$, as can be seen from Line 15, so each such embedding path is in $H^{(t)}_{\le j}$. Since $\Pi_{J \to \tilde{J}}$ maps all edges $\hat{e}$ in $J$ to paths in $\tilde{J}$, we thus have $[\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}} \circ \Pi_{J \to \tilde{J}}](\hat{e})$ in $H^{(t)}_{\le j}$.

By the way the algorithm sets the new embedding path $\Pi^{(t)}_j(e)$ in Line 19, we thus only have to argue about the path segments $\Pi^{(t)}_{<j}(e)[a, \hat{a}]$ and $\Pi^{(t)}_{<j}(e)[\hat{b}, b]$, where $a, b$ are the endpoints of $e$ and $\hat{a}, \hat{b}$ are the endpoints of $\hat{e}$. But again, by definition of $\hat{e}$ (see Definition 5.4), we have that both of these path segments are contained in $H^{(t)}_{<j}$.

Finally, for all $i > j$, $\Pi^{(t)}_i$ is set to be empty and therefore $\Pi^{(t)}_{\le i} = \Pi^{(t)}_{\le j}$.

It turns out that the proof of Lemma 5.6 is already the most complicated part of the analysis. 
We next bound congestion and length of the embedding. 


Claim 5.7. For any $0 \le i \le L$, we have $\mathrm{vcong}(\Pi^{(t)}_{\le i}) \le 4^i \gamma_c^{i+1} \Delta_{\max}(G)$.

Proof. Again, we prove by induction on stage $t$. We have for $i = 0$ that $\Pi^{(0)}_0$ as computed in the initialization stage has vertex-congestion at most $\gamma_c \Delta_{\max}(G)$ by Theorem 5.3. For $i > 0$, we have that $\Pi^{(0)}_i$ is empty; therefore its congestion is 0.

For $t - 1 \mapsto t$, we define $j = j^{(t)}$ and $t_{j-1} = t^{(t)}_{j-1}$. Observe that for each edge $e$ considered in the first foreach-loop starting in Line 9, $\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}}(\hat{e})$ consists only of the edges in $\Pi^{(t)}_{<j}(e)$ and the edge $e$ itself. It follows that every embedding path that contributes to the vertex congestion of a vertex $v$ in $\mathrm{vcong}(\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}})$ also contributes to the vertex congestion of $v$ in $\mathrm{vcong}(\Pi^{(t)}_{<j})$, and hence $\mathrm{vcong}(\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}}) \le \mathrm{vcong}(\Pi^{(t)}_{<j})$. Further, we can see from the construction of the graph $J$ that $\Delta(J) \le \mathrm{vcong}(\Pi^{(t)}_{<j})$.

Let us next analyze $\mathrm{vcong}(\Pi^{(t)}_{<j})$. By minimality of $j$ (see Line 3), we have $t_{j-1} < t$ and we can use the induction hypothesis to get $\mathrm{vcong}(\Pi^{(t_{j-1})}_{<j}) \le 4^{j-1} \gamma_c^{j} \Delta_{\max}(G)$. It is further immediate to see that, since the embedding $\Pi_{<j}$ was not affected by any recomputations since stage $t_{j-1}$, the vertex congestion can only have dropped ever since.

Thus, when the graph $J$ is sparsified in Line 14, by Theorem 5.5, we can conclude
$\mathrm{vcong}(\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}} \circ \Pi_{J \to \tilde{J}}) \le 2 \cdot 4^{j-1} \gamma_c^{j+1} \Delta_{\max}(G)$.

Finally, when we construct the embedding $\Pi_j$ in Line 19, the path segments $[\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}} \circ \Pi_{J \to \tilde{J}}](\hat{e})$ incur vertex congestion at most $\mathrm{vcong}(\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}} \circ \Pi_{J \to \tilde{J}})$, and the path segments $\Pi_{<j}(e)[a, \hat{a}]$ and $\Pi_{<j}(e)[\hat{b}, b]$ incur total vertex congestion at most $\mathrm{vcong}(\Pi^{(t)}_{<j})$.

As congestion is additive, we can upper bound the total congestion of $\Pi^{(t)}_j$ by $(4^{j-1} \gamma_c^{j} + 2 \cdot 4^{j-1} \gamma_c^{j+1}) \Delta_{\max}(G)$ and can finally use $\mathrm{vcong}(\Pi^{(t)}_{\le j}) \le \mathrm{vcong}(\Pi^{(t)}_{<j}) + \mathrm{vcong}(\Pi^{(t)}_j) \le 4^{j} \gamma_c^{j+1} \Delta_{\max}(G)$.

For all $i > j$, we note that $\Pi^{(t)}_i$ is empty and therefore $\mathrm{vcong}(\Pi^{(t)}_{\le i}) \le \mathrm{vcong}(\Pi^{(t)}_{\le j})$.
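Claim 5.8. For any $0 \le i \le L$ and stage $t$ divisible by $n^{1-i/L}$, we have $\mathrm{length}(\Pi^{(t)}_{\le i}) \le 2^i \gamma_\ell^{i+1}$.


Proof. We again take induction over time $t$. For $t = 0$, we note that $\Pi_0$ has length $\gamma_\ell$ by Theorem 5.3. For $t - 1 \mapsto t$, for $j = j^{(t)}$ and $t_{j-1} = t^{(t)}_{j-1}$, we have by induction hypothesis that $\mathrm{length}(\Pi^{(t_{j-1})}_{<j}(e)) \le 2^{j-1} \gamma_\ell^{j}$. But note that when we set the path $\Pi^{(t)}_j(e)$ in Line 19, the segments $\Pi^{(t)}_{<j}(e)[a, \hat{a}]$ and $\Pi^{(t)}_{<j}(e)[\hat{b}, b]$ (combined) are of length at most $2^{j-1} \gamma_\ell^{j}$ because they survived from $\Pi^{(t_{j-1})}_{<j}(e)$ by definition of $S_{j-1}$. Further, the segment $[\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}} \circ \Pi_{J \to \tilde{J}}](\hat{e})$ has length at most $\gamma_\ell \cdot \mathrm{length}(\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}})$ by Theorem 5.5. But by construction in Line 13, the embedding $\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}}$ has length at most $\mathrm{length}(\Pi^{(t)}_{<j}(e)) + 1$.

Combining these insights, we have $\mathrm{length}(\Pi^{(t)}_j) \le 2^{j-1} \gamma_\ell^{j} + \gamma_\ell \cdot (\mathrm{length}(\Pi^{(t)}_{<j}(e)) + 1) \le 2^{j} \gamma_\ell^{j+1}$. For $i > j$, we have $\mathrm{length}(\Pi_{\le i}) = \mathrm{length}(\Pi_{\le j})$.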


Now that we established all properties of the embedding, it remains to analyze the sparsifier H. 


Lemma 5.9. At any stage, $H$ consists of at most $O(n)$ edges and the amortized number of changes to the edge set of $H$ per update is $O(n^{1/L})$; $D$ is of amortized size $O(n^{1/L} (\gamma_c \gamma_\ell)^{O(L)})$. The initialization time of the algorithm is $O(m \gamma_\ell)$ and it has amortized update time $O(n^{1/L} (\gamma_c \gamma_\ell)^{O(L)} \Delta_{\max}(G))$.


Proof. The graph $H_0$ is computed during initialization and remains fixed and therefore consists of $O(n)$ edges by Theorem 5.3 and contributes no recourse. For each $j > 0$, $H_j$ is initially empty and only has edges added in stages $t$ divisible by $n^{1-j/L}$ (but not by $n^{1-(j-1)/L}$) in Line 15. In each such stage $t$, the graph $J$ is formed over the vertices $S_{j-1}$. It is straightforward to see by Definition 5.2 and Line 6 that $S_{j-1}$ is of size at most $n^{1-(j-1)/L}$ at any stage. Thus, when the graph $\tilde{J}$ is computed, it consists of at most $O(n^{1-(j-1)/L})$ edges by Theorem 5.5. The bounds on the overall sparsity of $H$ follow.

For the claim on the recourse, we note that in stages $t$ divisible by $n^{1-j/L}$ (but not by $n^{1-(j-1)/L}$), we recompute a spanner on the vertices $S_{j-1}$, which are a subset of the vertices in $G$, and add $O(n^{1-(j-1)/L})$ edges. For $j' > j$, the graphs $H_{j'}$ are empty after the algorithm finishes. Using an inductive argument, we can argue that the number of edge deletions at stage $t$ can also be upper bounded by $O(n^{1-(j-1)/L})$. Thus, it is not hard to see that at most $O(n^{1/L})$ amortized changes to the edge set of $H$ are made. It remains to argue about a rather subtle detail: if the update is a vertex split applied to $G^{(t-1)}$ to obtain $G^{(t)}$, then we also need to account for the recourse caused by the vertex split to the graphs $H_{j'}$ for $j' < j$. But note that we only pay in recourse cost for edges that are moved from a vertex $v$ to a vertex $v'$ if the degree of $v'$ after the vertex split is at most half the degree of $v$. Thus, we can charge each edge that is moved this way. Further, if the degree of $v'$ is then again increased by a factor of $3/2$, we can re-pay that cost of moving by charging the newly inserted edges. Following this charging scheme, we can argue that each edge can be charged to pay $O(\log(n))$ on insertion and an additional $O(\log(n))$ in recourse for the halving of degrees (after being compensated if the degree goes up again). Since there are $O(n)$ edges initially in $H$ and at most $O(n^{1/L})$ new edges appear after each update, our recourse bound follows.

To obtain the bound on $D^{(t)}$, we first observe that for each path $\Pi_j(e')$ constructed in Line 19, by induction over time, we can straightforwardly establish that $\Pi_{<j}(e')[a, \hat{a}]$ and $\Pi_{<j}(e')[\hat{b}, b]$ are subpaths of $\Pi^{(t-1)}_{G \to H}(e')$ by the properties of the set $S_{j-1}$. Thus, the only edges $e$ on any such $\Pi_j(e')$ not already on the path $\Pi^{(t-1)}_{G \to H}(e')$ are the edges in the subpath $[\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}} \circ \Pi_{J \to \tilde{J}}](\hat{e}')$. But clearly, $[\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}} \circ \Pi_{J \to \tilde{J}}](\hat{e}') \subseteq \Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}}(\tilde{J})$. It remains to use our bound on the number of edges in $\tilde{J}$ and the fact that the map $\Pi_{J \to H_{<j} \cup E_{\mathrm{affected}}}$ maps edges to paths of length $(\gamma_\ell)^{O(L)}$ in $H$ by Claim 5.8.


For the running time, we use Theorem 5.3 for the initialization, and observe that each vertex in $J$ as analyzed above has degree at most $O(\gamma_c^{O(L)} \Delta_{\max}(G))$, as discussed in the proof of Claim 5.7. Thus, using Theorem 5.5, computing each sparsifier $\tilde{J}$ of $J$ only takes time $O(|V(J)| \gamma_c^{O(L)} \gamma_\ell \Delta_{\max}(G))$. By standard amortization arguments and the fact that the time to compute the sparsifier dominates the update time of UPDATE($t$) asymptotically, the lemma follows.


To complete the proof of Theorem 5.1, we only have to analyze the success probability, which is 
straight-forward as the only random event at each stage is the invocation of the procedure SPARSIFY. 
Thus, taking a simple union bound over these events at all stages gives the desired result. 


5.2 Implementing the Sparsification Procedure 


It remains to describe and analyze the procedure that statically sparsifies graphs.


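Theorem 5.5. Given unweighted, undirected graphs $H'$ and $J$ with $V(J) \subseteq V(H')$ and an embedding $\Pi_{J \to H'}$ from $J$ into $H'$, there is a randomized algorithm SPARSIFY($H'$, $J$, $\Pi_{J \to H'}$) that returns a sparsifier $\tilde{J} \subseteq J$ with $|E(\tilde{J})| = O(|V(J)|)$ and an embedding $\Pi_{J \to \tilde{J}}$ from $J$ to $\tilde{J}$ such that


1. $\mathrm{vcong}(\Pi_{J \to H'} \circ \Pi_{J \to \tilde{J}}) \le \gamma_c \cdot (\mathrm{vcong}(\Pi_{J \to H'}) + \Delta_{\max}(J))$, and


2. $\mathrm{length}(\Pi_{J \to H'} \circ \Pi_{J \to \tilde{J}}) \le \gamma_\ell \cdot \mathrm{length}(\Pi_{J \to H'})$.


The algorithm runs in time $O(|E(J)| \cdot \gamma_\ell)$ and succeeds with probability at least $1 - n^{-C}$ for any constant $C$, specified before the procedure is invoked.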


Additional Tools. At a high level, the proof of Theorem 5.5 follows by performing an expander 
decomposition, uniformly subsampling each expander to produce a sparsifier, and then embedding 
each expander into its sparsifier by using a data structure for outputting short paths between 
vertices in decremental expanders. To formalize this, we start by surveying some tools on expander
graphs. Recall the definition of expanders.


Definition 5.10 (Expander). Let $G$ be an unweighted, undirected graph and $\phi \in (0, 1)$. We say that $G$ is a $\phi$-expander if for all $\emptyset \ne S \subsetneq V$, $|E_G(S, V \setminus S)| \ge \phi \cdot \min\{\mathrm{vol}_G(S), \mathrm{vol}_G(V \setminus S)\}$.
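For very small graphs, Definition 5.10 can be checked directly by enumerating all cuts. The following brute-force sketch is only meant to make the definition concrete (it takes exponential time and is unrelated to the decomposition algorithms used below); the function name and the example graphs are illustrative.

```python
from itertools import combinations

def is_expander(n, edges, phi):
    """Check |E(S, V \\ S)| >= phi * min(vol(S), vol(V \\ S)) for every nonempty proper S."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total_volume = sum(degree)
    for size in range(1, n):
        for subset in combinations(range(n), size):
            side = set(subset)
            cut = sum((u in side) != (v in side) for u, v in edges)
            volume = sum(degree[v] for v in side)
            if cut < phi * min(volume, total_volume - volume):
                return False
    return True

complete_4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
path_4 = [(0, 1), (1, 2), (2, 3)]
```

On the complete graph on four vertices the sparsest cut splits the vertices two and two (4 cut edges against volume 6), while on the path the middle edge is the bottleneck (1 cut edge against volume 3).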


We can further get a collection of expander decomposition with near uniform degrees in the 
expanders. The proof of this statement follows almost immediately from [SW19] and is therefore 
deferred to Appendix B.2. 


Theorem 5.11. Given an unweighted, undirected graph $G$, there is an algorithm DECOMPOSE($G$) that computes an edge-disjoint partition of $G$ into graphs $G_0, G_1, \ldots, G_\ell$ for $\ell = O(\log n)$ such that for each $0 \le i \le \ell$, $|E(G_i)| \le 2^i n$ and for each nontrivial connected component $X$ of $G_i$, $G_i[X]$ is a $\psi$-expander for $\psi = \Omega(1/\log^3(m))$, and each $x \in X$ has $\deg_{G_i}(x) \ge \psi 2^i$. The algorithm runs correctly in time $O(m \log^7(m))$, and succeeds with probability at least $1 - n^{-C}$ for any constant $C$, specified before the procedure is invoked.


Further, we use the following result from [CS21]. Given a $\phi$-expander undergoing edge deletions, the data structure below implicitly maintains a subset of the expander that still has large conductance using standard expander pruning techniques (see for example [NSW17; SW19]). Further, on the subset of the graph that still has good conductance, it can output a path of length $m^{o(1)}$ between any pair of queried vertices.
Theorem 5.12 (see Theorem 3.9 in arXiv v1 in [CS21]). Given an unweighted, undirected graph $G$ that is a $\phi$-expander for some $\phi > 0$, there is a deterministic data structure $DS_{\mathrm{ExpPath}}$ that explicitly grows a monotonically increasing "forbidden" vertex subset $\hat{V} \subseteq V(G)$ while handling the following operations:


• DELETE($e$): Deletes edge $e$ from $E(G)$ and then explicitly outputs a set of vertices that were added to $\hat{V}$ due to the edge deletion.


• GETPATH($u, v$): for any $u, v \in V(G) \setminus \hat{V}$ returns a path consisting of at most $\gamma_{\mathrm{ExpPath}}$ edges between $u$ and $v$ in the graph $G[V(G) \setminus \hat{V}]$. Each path query can be implemented in time $\gamma_{\mathrm{ExpPath}}$, where $\gamma_{\mathrm{ExpPath}} = (\log(m)/\phi)^{O(\sqrt{\log(m)})}$. The operation does not change the set $\hat{V}$.


The data structure ensures that after $t$ edge deletions $\mathrm{vol}_{G^{(0)}}(\hat{V}) \le \gamma_{\mathrm{del}} t / \phi$ for some constant $\gamma_{\mathrm{del}} = O(1)$. The total update time taken by the data structure for initialization and over all deletions is $O(|E(G)| \gamma_{\mathrm{ExpPath}})$.


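The Algorithm. We can now use these tools to give Algorithm 2 that implements the procedure SPARSIFY($H'$, $J$, $\Pi_{J \to H'}$).


Algorithm 2: SPARSIFY($H'$, $J$, $\Pi_{J \to H'}$)


1 $J_0, J_1, \ldots, J_\ell \leftarrow$ DECOMPOSE($J$).
2 $\tilde{J} \leftarrow (V, \emptyset)$.
3 foreach $e \in E(J)$ do $\Pi_{J \to \tilde{J}}(e) \leftarrow \emptyset$.
4 foreach $i \in [0, \ell]$ and connected component $X$ in $J_i$ do
      /* Sample the edges that are added to the sparsifier $\tilde{J}$. */
5     $p_{X,i} \leftarrow \min\left\{\frac{48 C \log(m)}{\psi \Delta_{\min}(J_i[X])}, 1\right\}$.
6     Construct graph $\tilde{J}_{X,i}$ by sampling each edge $e \in E(J_i[X])$ independently with probability $p_{X,i}$.
7     Add all edges in $\tilde{J}_{X,i}$ to $\tilde{J}$.
      /* Embed all edges in $J_i[X]$ into the sampled local graph $\tilde{J}_{X,i}$. */
8     foreach $e \in \tilde{J}_{X,i}$ do $\Pi_{J \to \tilde{J}}(e) \leftarrow e$.
9     while there exists an edge $e \in E(J_i[X])$ with $\Pi_{J \to \tilde{J}}(e) = \emptyset$ do
10        $J_{\mathrm{APSP}} \leftarrow$ a copy of $\tilde{J}_{X,i}$.
11        Initialize $DS_{\mathrm{ExpPath}}$ on graph $J_{\mathrm{APSP}}$ with parameter $\phi = \psi/4$ maintaining set $\hat{V}$.
12        foreach $e \in E(\tilde{J}_{X,i})$ do $\mathrm{cong}(e) \leftarrow 0$.
13        while there exists an edge $e \in E(J_i[X \setminus \hat{V}])$ with $\Pi_{J \to \tilde{J}}(e) = \emptyset$ do
14            Let $u$ and $v$ be the endpoints of edge $e$.
15            $\Pi_{J \to \tilde{J}}(e) \leftarrow DS_{\mathrm{ExpPath}}$.GETPATH($u, v$).
16            foreach $e' \in \Pi_{J \to \tilde{J}}(e)$ do
17                $\mathrm{cong}(e') \leftarrow \mathrm{cong}(e') + 1$
18                if $\mathrm{cong}(e') > \tau$ then
19                    Remove edge $e'$ from $J_{\mathrm{APSP}}$ via $DS_{\mathrm{ExpPath}}$.DELETE($e'$).
20 return $(\tilde{J}, \Pi_{J \to \tilde{J}})$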
The algorithm has two key steps:


1. Sampling: The graph $J$ is first decomposed via Theorem 5.11. Then, the algorithm iterates over $\psi$-expanders $J_i[X]$ with near-uniform degrees. It is well-known that to obtain a sparsifier $\tilde{J}_{X,i}$ of such graphs, one can simply sample each edge with probability roughly $\frac{\log m}{\psi \Delta_{\min}(J_i[X])}$. To obtain the final sparsifier $\tilde{J}$, we only have to take the union over all samples $\tilde{J}_{X,i}$.


2. Embedding: We then proceed to find an embedding for edges in $J_i[X]$ into $\tilde{J}_{X,i}$. The sampled edges can be handled trivially by embedding them into themselves. To embed the remaining edges $e \in E(J)$ into $\tilde{J}_{X,i}$, we exploit that $\tilde{J}_{X,i}$ is an expander graph, which allows us to employ the data structure from Theorem 5.12 on $\tilde{J}_{X,i}$ to query for a path between the endpoints of $e$ in $\tilde{J}_{X,i}$ efficiently. We further keep track of the congestion of each edge in $\tilde{J}_{X,i}$ by our embedding and remove edges that are too congested (at least until we cannot embed anymore in any other way).
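The two steps can be sketched on a single component as follows. This is a deliberately simplified illustration and not the procedure of Algorithm 2: it takes the sampling probability `p` as an input, replaces the expander path data structure of Theorem 5.12 by a plain breadth-first search, and omits the congestion tracking and pruning entirely. All names are hypothetical, and the graph is assumed to be simple.

```python
import random
from collections import deque

def bfs_path(adj, source, target):
    """Shortest path (as a vertex list) in an adjacency-list graph, or None if disconnected."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return None

def sample_and_embed(vertices, edges, p, seed=0):
    """Step 1: keep each edge with probability p. Step 2: route every edge in the sample."""
    rng = random.Random(seed)
    sampled = {e for e in edges if rng.random() < p}
    adj = {v: [] for v in vertices}
    for u, v in sampled:
        adj[u].append(v)
        adj[v].append(u)
    embedding = {}
    for e in edges:
        u, v = e
        # Sampled edges embed into themselves; the rest are rerouted (None if unreachable).
        embedding[e] = [u, v] if e in sampled else bfs_path(adj, u, v)
    return sampled, embedding

verts = list(range(6))
complete_6 = [(u, v) for u in verts for v in verts if u < v]
```

A breadth-first search yields short paths but gives no control over congestion, which is exactly what the congestion counters and edge removals in Lines 16 to 19 are for.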


We point out that Algorithm 2 in no way uses the embedding $\Pi_{J \to H'}$. Still, we show that due to the structure given, we can tightly upper bound the congestion and length of the embedding given by the composition $\Pi_{J \to H'} \circ \Pi_{J \to \tilde{J}}$.
Proof of Theorem 5.5. We start by proving the following structural claim. For the rest of the 
section, we condition on the event that it holds for each relevant i and X. 


Claim 5.13. For each $i$ and connected component $X$ in $J_i$, the corresponding sample $\tilde{J}_{X,i}$ satisfies for each $S \subseteq X$ that $\frac{1}{2}|E_{\tilde{J}_{X,i}}(S, X \setminus S)|/p_{X,i} \le |E_{J_i}(S, X \setminus S)| \le 2|E_{\tilde{J}_{X,i}}(S, X \setminus S)|/p_{X,i}$, with probability at least $1 - n^{-2C}$.

Proof. Since for $i = 0$, $J_i[X] = \tilde{J}_{X,i}$ and $p_{X,i} = 1$, the claim is vacuously true. For $i > 0$, consider any cut $(S, X \setminus S)$ in $J_i[X]$ and assume wlog $k = |S| \le |X \setminus S|$. Since $J_i[X]$ is a $\psi$-expander, we have that $|E_{J_i}(S, X \setminus S)| \ge \psi \Delta_{\min}(J_i[X]) |S|$ by Definition 5.10. The algorithm samples each such edge $e$ into the sample $\tilde{J}_{X,i}$ independently with probability $p_{X,i}$. Thus, $\mathbb{E}|E_{\tilde{J}_{X,i}}(S, X \setminus S)| = |E_{J_i}(S, X \setminus S)| \cdot p_{X,i} \ge 48 C \log(m) |S|$.

Using a Chernoff bound as in Theorem 3.1 on the random variable $|E_{\tilde{J}_{X,i}}(S, X \setminus S)|$, we can thus conclude that our claim is correct on the cut $(S, X \setminus S)$ with probability at least $1 - 2m^{-4Ck}$. Since there are at most $\binom{|X|}{k} \le |X|^{2k}$ cuts where the smaller side has exactly $k$ vertices, we can finally use a union bound over all cuts to complete the proof.


By the claim above, and the fact that each graph $J_i[X]$ (for $i > 0$) is a $\psi$-expander by Theorem 5.11, we can conclude that each $\tilde{J}_{X,i}$ is a $\psi/4$-expander.


Corollary 5.14. For any $i \ge 0$ and $X$ as used in Algorithm 2, $\tilde{J}_{X,i}$ is a $\psi/4$-expander.


This implies that our initializations of the data structure $DS_{\mathrm{ExpPath}}$ in Line 11 are legal according to Theorem 5.12. Next, let us give an upper bound on the congestion of the embedding $\Pi_{J \to \tilde{J}}$.


Claim 5.15. For any i € [0,4] and X as used in Algorithm 2, we have econg(H,_,s|z(y,[x])) < 
Yxi = O ( epee log(m) 


TEF restricted to edges 
PX,i 
in J;[X]. 


) where TL, FELIX) denotes the embedding Il, 7 
Proof. We first observe that only in the foreach-loop iteration on $i$ and $X$ can any congestion be added to edges in $\tilde{J}_{X,i}$, by the disjointness of the graphs $J_i$ (see Theorem 5.11). Further, in the particular iteration on $i$ and $X$, up to the while-loop starting in Line 9, the congestion of $\Pi_{J \to \tilde{J}}|_{E(J_i[X])}$ is at most 1 since only edges sampled into $\tilde{J}_{X,i}$ are embedded into themselves.

It is straightforward to observe that the congestion of the partial embedding $\Pi_{J \to \tilde{J}}$ (restricted to $E(J_i[X])$) throughout each iteration of the outer while-loop starting in Line 9 is increased by at most $\tau$, as the algorithm tracks the congestion of the current iteration explicitly and removes edges that are too congested in Line 19. It thus remains to bound the number of iterations of the outer while-loop starting in Line 9 by $O(\log(m))$. We can then conclude that the total congestion on the edges is at most $O(\tau \log(m))$.

To bound this number of iterations, let us analyze a single outer while-loop iteration (starting at Line 9), and fix the end of such an iteration. Let $t$ be the number of deletions processed by the data structure $DS_{\mathrm{ExpPath}}$ throughout the iteration and $\hat{V}^{\mathrm{FINAL}}$ the set $\hat{V}$ at the end of the while-loop iteration. Using that the while-loop terminates, we can further conclude that the only edges not embedded after the current iteration are those outside $E(J_i[X \setminus \hat{V}^{\mathrm{FINAL}}])$. By Theorem 5.12,
vol (VFINAL) < Aygeit/w and therefore by Claim 5.13, we have |E(J;[X]) \ E(JiLX \ VF44})| < 


X,i 
volj, x(V FINAL) < ne But at the same time, we know that at least t - T/YErpPath edges have 
been embedded in the current while-loop iteration, since each edge embedding adds at most 7A» Path 
units to the total congestion. We conclude that each iteration, we embed at least a {fraction of 
the edges in $J_i[X]$ that were not embedded before the current while-loop iteration. It follows that there are at most $O(\log(m))$ iterations, which establishes our claim.


Claim 5.16. $\mathrm{vcong}(\Pi_{J \to H'} \circ \Pi_{J \to \tilde{J}}) \le \gamma_c \cdot (\mathrm{vcong}(\Pi_{J \to H'}) + \Delta_{\max}(J))$ with probability at least $1 - n^{-2C}$.


Proof. Let us fix any vertex $v \in V(H')$. We define $E_v = \{e \in E(J) \mid v \in \Pi_{J \to H'}(e)\}$ to be the edges in $J$ whose embedding path contains $v$. By definition of vertex congestion for embeddings, $|E_v| \le \mathrm{vcong}(\Pi_{J \to H'})$.

Next, for each edge $e \in E_v$, let $e$ be in $J_i$ after the decomposition in Line 1, in the component $X$. We define the random variable

$Y_e = \gamma_{X,i}$ if $e \in E(\tilde{J})$, and $Y_e = 0$ otherwise.

Note that the random variables $Y_e$ are independent as edges are sampled independently at random into $\tilde{J}$ in Line 6. Further, by Claim 5.15, we have for each edge $e$, $\mathrm{econg}(\Pi_{J \to \tilde{J}}|_{E(J_i[X])}, e) \le Y_e$ and thus $\mathrm{vcong}(\Pi_{J \to H'} \circ \Pi_{J \to \tilde{J}}, v) \le \sum_{e \in E_v} Y_e$.

We will bound this sum using a Chernoff bound. We first observe that every variable $Y_e \in [0, W]$ for $W = \gamma_{\mathrm{var}} \gamma_{\mathrm{ExpPath}} \Delta_{\max}(J)$ for some scalar $\gamma_{\mathrm{var}}$, which follows from the definitions of $\gamma_{X,i}$ in Claim 5.15 and $p_{X,i}$ in Line 5. Across all the edge congestion variables $Y_e$, we have a uniform bound $\mu_{\mathrm{edge}}$ on the expectation given by $\mathbb{E}[Y_e] = p_{X,i} \cdot \gamma_{X,i} = \mu_{\mathrm{edge}}$. Therefore $\mathbb{E}[\sum_{e \in E_v} Y_e] \le \mu_{\mathrm{edge}} \cdot \mathrm{vcong}(\Pi_{J \to H'})$. We can conclude by Theorem 3.1 that


P 5 Ye < 24C log(n) -W + 2ueagevcong(I jsa) | > 1-— on. 
ec Ev 


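We can now set $\gamma_c = 24 C \log(n) \cdot \gamma_{\mathrm{ExpPath}} \cdot \gamma_{\mathrm{var}} + 2\mu_{\mathrm{edge}} = (\gamma_{\mathrm{ExpPath}})^{O(1)}$, which is consistent with our requirements on $\gamma_c$. Finally, it remains to take a simple union bound over all vertices $v \in V(H')$.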




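Claim 5.17. $\mathrm{length}(\Pi_{J \to H'} \circ \Pi_{J \to \tilde{J}}) \le \gamma_\ell \cdot \mathrm{length}(\Pi_{J \to H'})$.


Proof. Consider any edge $e \in E(J)$. If $e$ is sampled into $\tilde{J}$, then $\Pi_{J \to \tilde{J}}(e) = e$, as can be seen in Line 8. Otherwise, $\Pi_{J \to \tilde{J}}(e)$ is of length at most $\gamma_{\mathrm{ExpPath}}$, as can be seen from Line 15 and Theorem 5.12. Setting $\gamma_\ell = \gamma_{\mathrm{ExpPath}}$ thus ensures our claim.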


Combining Claim 5.16 and Claim 5.17, we have established the properties claimed in Theo- 
rem 5.5. The success probability follows by taking a straight-forward union bound over the events 
used in the analysis above and the success of Theorem 5.11. The run-time of the algorithm can 
be seen from inspecting Algorithm 2, Theorem 5.11, the fact that the while-loop in Line 9 runs 
at most O(log(n)) times for each iteration of the outer foreach-loop (established in the proof of 
Claim 5.15) and finally the run-time guarantees on the data structure in Theorem 5.12. 


6 Data Structure Chain 


The goal of Sections 6 to 8 is to build a data structure to dynamically maintain $m^{o(1)}$-approximate undirected minimum-ratio cycles under changing costs and lengths, i.e. for gradients $g \in \mathbb{R}^E$ and lengths $\ell \in \mathbb{R}^E_{>0}$, return a (compactly represented) cycle $\Delta$ satisfying $B^\top \Delta = 0$ and

$$\frac{\langle g, \Delta \rangle}{\|L\Delta\|_1} \le m^{-o(1)} \min_{B^\top f = 0} \frac{\langle g, f \rangle}{\|Lf\|_1}. \qquad (24)$$
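To make the objective concrete, the following sketch evaluates the ratio $\langle g, \Delta \rangle / \|L\Delta\|_1$ for an explicitly given circulation on a tiny graph, under the assumption that $L = \mathrm{diag}(\ell)$ so that $\|L\Delta\|_1 = \sum_e \ell_e |\Delta_e|$. It only evaluates candidates and does not search for the minimizer; the function names and the example numbers are illustrative.

```python
def is_circulation(num_vertices, edges, delta):
    """Check flow conservation: net flow into every vertex is zero."""
    net = [0] * num_vertices
    for (tail, head), flow in zip(edges, delta):
        net[tail] -= flow
        net[head] += flow
    return all(x == 0 for x in net)

def cycle_ratio(g, lengths, delta):
    """<g, delta> divided by sum_e lengths[e] * |delta[e]|."""
    numerator = sum(ge * de for ge, de in zip(g, delta))
    denominator = sum(le * abs(de) for le, de in zip(lengths, delta))
    return numerator / denominator

# Directed triangle 0 -> 1 -> 2 -> 0 with gradients g and lengths ell.
triangle = [(0, 1), (1, 2), (2, 0)]
g = [1, 1, -4]
ell = [1, 1, 2]
```

Pushing one unit around the triangle gives ratio $(1 + 1 - 4)/(1 + 1 + 2) = -1/2$; reversing the direction flips the sign, and scaling the circulation leaves the ratio unchanged.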


Our data structure does not work against fully adaptive adversaries. However, it works for updates 
coming from the IPM. We capture this notion with the following definition. 


Definition 6.1 (Hidden Stable-Flow Chasing Updates). Consider a dynamic graph $G^{(t)}$ undergoing batches of updates $U^{(1)}, \ldots, U^{(t)}, \ldots$ consisting of edge insertions/deletions and vertex splits. We say the sequences $g^{(t)}$, $\ell^{(t)}$, and $U^{(t)}$ satisfy the hidden stable-flow chasing property if there are hidden dynamic circulations $c^{(t)}$ and hidden dynamic upper bounds $w^{(t)}$ such that the following holds at all stages $t$:


1. $c^{(t)}$ is a circulation: $B_{G^{(t)}}^\top c^{(t)} = 0$.


2. $w^{(t)}$ upper bounds the length of $c^{(t)}$: $|\ell^{(t)}_e c^{(t)}_e| \le w^{(t)}_e$ for all $e \in E(G^{(t)})$.


3. For any edge $e$ in the current graph $G^{(t)}$, and any stage $t' \le t$, if the edge $e$ was already present in $G^{(t')}$, i.e. $e \in G^{(t)} \setminus \bigcup_{s=t'+1}^{t} U^{(s)}$, then $w^{(t)}_e \le 2 w^{(t')}_e$.


4. Each entry of $w^{(t)}$ and $\ell^{(t)}$ is quasipolynomially lower and upper-bounded: $\log w^{(t)}_e \in [-\log^{O(1)} m, \log^{O(1)} m]$ and $\log \ell^{(t)}_e \in [-\log^{O(1)} m, \log^{O(1)} m]$ for all $e \in E(G^{(t)})$.


Intuitively, Definition 6.1 says that even while $g^{(t)}$ and $\ell^{(t)}$ change, there is a witness circulation $c^{(t)}$ that is fairly stable. More precisely, there is some upper bound $w^{(t)}$ on the coordinate-wise lengths of $c^{(t)}$ that increases by at most a factor of 2, except on edges that are explicitly updated. Interestingly, even though both $c^{(t)}$ and $w^{(t)}$ are hidden from the data structure, their existence is sufficient.
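Items 1 to 3 of Definition 6.1 can be read as simple checks on a candidate witness. The sketch below is purely illustrative: in the actual setting $c^{(t)}$ and $w^{(t)}$ are hidden and never materialized by the data structure. It assumes edges are stored in dictionaries keyed by a hypothetical edge identifier and that `updated_history[s]` is the set of edges touched by batch $U^{(s)}$.

```python
def is_valid_witness(num_vertices, edges, lengths, c, w):
    """Items 1 and 2 at one stage: c is a circulation and |l_e * c_e| <= w_e on every edge."""
    net = [0] * num_vertices
    for e, (tail, head) in edges.items():
        net[tail] -= c[e]
        net[head] += c[e]
    conserved = all(x == 0 for x in net)
    bounded = all(abs(lengths[e] * c[e]) <= w[e] for e in edges)
    return conserved and bounded

def widths_are_stable(w_history, updated_history):
    """Item 3: w_e at stage t is at most 2 * w_e at stage t' unless e was updated in between."""
    for t in range(len(w_history)):
        for t_prev in range(t + 1):
            touched = set().union(*updated_history[t_prev + 1:t + 1])
            for e, w_now in w_history[t].items():
                if e in w_history[t_prev] and e not in touched:
                    if w_now > 2 * w_history[t_prev][e]:
                        return False
    return True

tri_edges = {0: (0, 1), 1: (1, 2), 2: (2, 0)}
tri_lengths = {0: 1, 1: 2, 2: 1}
unit_cycle = {0: 1, 1: 1, 2: 1}
```

The second check makes the exception for explicitly updated edges visible: a width may more than double only if its edge appears in one of the intervening update batches.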

The IPM guarantees in Section 4 can be connected to Definition 6.1 by setting $c^{(t)} = f^* - f$ and $w^{(t)} = 10 + |\ell(f) \circ c^{(t)}|$, where $f$ is the current flow maintained by our algorithm. The guarantees of Definition 6.1 then hold by a combination of Lemmas 4.7, 4.9 and 4.10. This is formalized in Lemma 9.2 in Section 9.
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Our main data structure dynamically maintains min-ratio cycles under hidden stable-flow chas- 
ing updates. 


Theorem 6.2 (Dynamic Min-Ratio Cycle with Hidden Stable-Flow Chasing Updates). There is a
data structure that on a dynamic graph $G^{(t)}$ maintains a collection of $s = O(\log n)^d$ spanning trees
$T_1, T_2, \dots, T_s \subseteq G^{(t)}$ for $d = O(\log^{1/8} m)$, and supports the following operations:

• UPDATE($U^{(t)}, g^{(t)}, \ell^{(t)}$): Update the gradients and lengths to $g^{(t)}$ and $\ell^{(t)}$. For the update to
be supported, we require that $U^{(t)}$ contains only edge insertions/deletions and $g^{(t)}$, $\ell^{(t)}$ and
$U^{(t)}$ satisfy the hidden stable-flow chasing property (Definition 6.1) with hidden circulation
$c^{(t)}$ and upper bounds $w^{(t)}$, and for a parameter $\alpha$,

$$\frac{\langle g^{(t)}, c^{(t)}\rangle}{\|w^{(t)}\|_1} \le -\alpha.$$

• QUERY(): Return a tree $T_i$ for $i \in [s]$ and a cycle $\Delta$ represented as $m^{o(1)}$ paths on $T_i$ (specified
by their endpoints and the tree index) and $m^{o(1)}$ explicitly given off-tree edges such that for
$\kappa = \exp(-O(\log^{7/8} m \cdot \log\log m))$,

$$\frac{\langle g^{(t)}, \Delta\rangle}{\|L^{(t)}\Delta\|_1} \le -\kappa\alpha.$$

Over $\tau$ stages the algorithm succeeds whp. with total runtime $m^{o(1)}(m + Q)$ for $Q = \sum_{t=1}^{\tau} |U^{(t)}|$.


To interpret Theorem 6.2, note that $\Delta = c^{(t)}$ would be a valid output by the guarantees in
Definition 6.1, i.e. $\|L^{(t)} c^{(t)}\|_1 \le \|w^{(t)}\|_1$ from Item 2. Thus the data structure guarantee can be
interpreted as efficiently representing and returning a cycle whose quality is within a $m^{o(1)}$ factor
of $c^{(t)}$. Eventually, we will add $\Delta$ to our flow efficiently by using link-cut trees.
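To make the quality measure in (24) and Theorem 6.2 concrete, the following Python sketch evaluates the ratio $\langle g, \Delta\rangle / \|L\Delta\|_1$ of an explicitly given circulation on a toy graph. It is only an illustration of the objective: the function name and the edge-list encoding are assumptions of this example, and the data structure itself never materializes $\Delta$ as a dense vector.

```python
from fractions import Fraction


def cycle_ratio(edges, g, ell, delta, num_vertices):
    """Return <g, delta> / ||L delta||_1 after checking B^T delta = 0."""
    # Net flow at every vertex must vanish for delta to be a circulation.
    net = [Fraction(0)] * num_vertices
    for (tail, head), d in zip(edges, delta):
        net[tail] -= d  # d units leave the tail
        net[head] += d  # and enter the head
    if any(x != 0 for x in net):
        raise ValueError("delta is not a circulation")
    numerator = sum(Fraction(ge) * d for ge, d in zip(g, delta))
    denominator = sum(Fraction(le) * abs(d) for le, d in zip(ell, delta))
    return numerator / denominator


# Directed triangle 0 -> 1 -> 2 -> 0 plus a chord 0 -> 2.
edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
g = [1, -4, 1, 2]
ell = [1, 1, 2, 1]
around_triangle = [1, 1, 1, 0]
chord_cycle = [0, 0, -1, -1]  # traverses 2 -> 0 and 0 -> 2 backwards
```

Here `around_triangle` has ratio $-2/4 = -1/2$, while `chord_cycle` has ratio $-3/3 = -1$, so the latter is the better (more negative) of the two candidate cycles.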

Section 6 focuses on introducing the general layout of the data structure, and is definition-heavy. 
Section 7 explains how to plug in the circulations c and upper bounds w in our data structure, and 
shows a weaker version of Theorem 6.2 in Theorem 7.1. We use the weaker Theorem 7.1 to show 
the full cycle-finding result Theorem 6.2 by defining a rebuilding game in Section 8. 


6.1 Dynamic Low-Stretch Decompositions (LSD) 


In the following subsections we describe the components of the data structure we maintain to show 
Theorem 6.2. At a high level, our data structure consists of $d$ levels, each of which has approximately
a factor of $k = m^{1/d}$ fewer edges than the previous level. The edge reduction is achieved in two
parts. First, we reduce the number of vertices to $O(m/k)$ by maintaining a spanning forest $F$ of $G$
with $O(m/k)$ connected components, and then recurse on $G/F$, the graph where each connected
component of $F$ in $G$ is contracted to a single vertex. While $G/F$ now has $O(m/k)$ vertices, it
still potentially has up to $m$ edges, so we need to employ the dynamic sparsification procedure in
Theorem 5.1 to reduce the number of edges to $O(m/k)$.
We start by defining a rooted spanning forest and its induced stretch. 


Definition 6.3 (Rooted Spanning Forest). A rooted spanning forest of a graph G = (V, E) is a 
forest F on V such that each connected component of F has a unique distinguished vertex known 
as the root. We denote the root of the connected component of a vertex $v \in V$ as $\mathrm{root}^F_v$.




Variable                               Definition
$\ell^{(t)}, g^{(t)}$                  Lengths and gradients on a dynamic graph $G^{(t)}$
$c^{(t)}, w^{(t)}$                     Hidden circulation & upper bounds with $|\ell^{(t)} \circ c^{(t)}| \le w^{(t)}$ (Definition 6.1)
$F$                                    Rooted spanning forest of $G$ (Definition 6.3)
$p(F[u, v])$                           Path vector from $u \to v$ in a forest $F$
$\mathrm{str}^{F,\ell}_e$              Stretch of edge $e$ with respect to spanning forest $F$ and lengths $\ell$ (Definition 6.4)
$\widetilde{\mathrm{str}}_e$           Stretch overestimates stable under edge deletions (Lemma 6.5)
$C(G, F)$                              Core graph from a spanning forest $F$ (Definition 6.7)
$\hat{e}$                              Image of edge $e \in E(G)$ into the core graph $C(G, F)$
$S(G, F)$                              Sparsified core graph $S(G, F) \subseteq C(G, F)$ (Definition 6.9)
$\mathcal{G}_0, \dots, \mathcal{G}_d$  $B$-branching tree chain (Definition 6.10)
$G_0, \dots, G_d$                      Tree chain (Definition 6.10)
$T^{G_0, \dots, G_d}$                  Tree in $G$ corresponding to tree chain $G_0, \dots, G_d$ (Definition 6.11)
$\mathcal{T}^G$                        Collection of $B^d$ trees on $G$ from $B$-branching tree chain (Definition 6.11)
$\mathrm{prev}^{(t)}_i$                Previous rebuild times of branching tree chain (Definition 6.12)

Table 1: Important definitions and notation to describe the data structure. In general a $(t)$
superscript is the corresponding object at time $t$ of a sequence of updates.


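Definition 6.4 (Stretches of F). Given a rooted spanning forest $F$ of a graph $G = (V, E)$ with
lengths $\ell \in \mathbb{R}^E_{>0}$, the stretch of an edge $e = (u, v) \in E$ is given by

$$\mathrm{str}^{F,\ell}_e \stackrel{\mathrm{def}}{=} \begin{cases} 1 + \langle \ell, |p(F[u, v])| \rangle / \ell_e & \text{if } \mathrm{root}^F_u = \mathrm{root}^F_v, \\ 1 + \langle \ell, |p(F[u, \mathrm{root}^F_u])| + |p(F[v, \mathrm{root}^F_v])| \rangle / \ell_e & \text{if } \mathrm{root}^F_u \ne \mathrm{root}^F_v, \end{cases}$$

where $p(F[\cdot, \cdot])$, as defined in Section 3, maps a path to its signed indicator vector.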


When F is a spanning tree Definition 6.4 coincides with the definition of stretch for a LSST. 
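The two cases of Definition 6.4 can be sketched in Python as follows. The forest is encoded by parent pointers (roots have parent `None`) and `up_len[x]` is the length of the forest edge from `x` to its parent; both the encoding and the function names are assumptions of this example rather than part of the data structure.

```python
def chain_to_root(parent, v):
    """Return the list v, parent(v), ..., root of v's component."""
    chain = [v]
    while parent[chain[-1]] is not None:
        chain.append(parent[chain[-1]])
    return chain


def forest_stretch(parent, up_len, u, v, ell_e):
    """Stretch of an edge (u, v) of length ell_e w.r.t. a rooted forest."""
    cu, cv = chain_to_root(parent, u), chain_to_root(parent, v)
    if cu[-1] == cv[-1]:
        # Same component: discard the shared part above the lowest
        # common ancestor, leaving exactly the forest path u -> v.
        while len(cu) > 1 and len(cv) > 1 and cu[-2] == cv[-2]:
            cu.pop()
            cv.pop()
    # Otherwise both chains run to their own roots (second case).
    path_len = sum(up_len[x] for x in cu[:-1]) + sum(up_len[x] for x in cv[:-1])
    return 1 + path_len / ell_e


# Path forest 3 - 2 - 1 - 0 rooted at 0.
parent = {0: None, 1: 0, 2: 1, 3: 2}
up_len = {1: 1, 2: 2, 3: 1}
parent_after_cut = dict(parent)
parent_after_cut[2] = None  # delete forest edge (2, 1); vertex 2 becomes a root
```

For the edge $(1, 3)$ of length 2, the stretch is $1 + 3/2$ in the original forest and $1 + (1 + 1)/2$ after the cut. An edge between the two roots 0 and 2 has stretch exactly 1, matching the remark that edges between roots have stretch 1.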

The goal of the remainder of this section is to give an algorithm to maintain a Low Stretch 
Decomposition (LSD) of a dynamic graph G. As a spanning forest decomposes a graph into vertex 
disjoint connected subgraphs, a LSD consists of a spanning forest F of low stretch. The algorithm 
produces stretch upper bounds that hold throughout all operations, and the number of connected 
components of F grows by amortized O(1) per update. At a high level, for any edge insertion or 
deletion, the algorithm will force both endpoints to become roots of some component of F. This 
way, any inserted edge will actually have stretch 1 because both endpoints are roots. 


Lemma 6.5 (Dynamic Low Stretch Decomposition). There is a deterministic algorithm with total
runtime $O(m)$ that on a graph $G = (V, E)$ with lengths $\ell \in \mathbb{R}^E_{>0}$, weights $v \in \mathbb{R}^E_{>0}$, and parameter $k$,
initializes a tree $T$ spanning $V$, and a rooted spanning forest $F \subseteq T$, an edge-disjoint partition $\mathcal{W}$ of
$F$ into $O(m/k)$ subtrees and stretch overestimates $\widetilde{\mathrm{str}}_e$. The algorithm maintains $F$ decrementally
against $\tau$ batches of updates to $G$, say $U^{(1)}, U^{(2)}, \dots, U^{(\tau)}$, such that $\widetilde{\mathrm{str}}_e \stackrel{\mathrm{def}}{=} 1$ for any new edge $e$
added by either edge insertions or vertex splits, and:

1. $F$ has initially $O(m/k)$ connected components and $O(q \log^2 n)$ more after $t$ update batches of
total encoding size $q \stackrel{\mathrm{def}}{=} \sum_{i=1}^{t} \mathrm{Enc}(U^{(i)})$ satisfying $q \le O(m)$.

2. $\mathrm{str}^{F,\ell}_e \le \widetilde{\mathrm{str}}_e \le O(k \gamma_{LSST} \log n)$ for all $e \in E$ at all times, including inserted edges $e$.

3. $\sum_{e \in E^{(0)}} v_e \widetilde{\mathrm{str}}_e \le O(\|v\|_1 \gamma_{LSST} \log^2 n)$, where $E^{(0)}$ is the initial edge set of $G$.

4. Initially, $\mathcal{W}$ contains $O(m/k)$ subtrees. For any piece $W \in \mathcal{W}$, $W \subseteq V$, $|\partial W| \le 1$ and
$\mathrm{vol}_G(W \setminus R) \le O(k \log^2 n)$ at all times, where $R \supseteq \partial\mathcal{W}$ is the set of roots in $F$. Here, $\partial W$
denotes the set of boundary vertices that are in multiple partition pieces.


Intuitively, the first property says that $F$ has $O(m/k)$ roots initially and each update $x$ adds
$\tilde{O}(\mathrm{Enc}(x))$ roots to it. For example, each edge update adds $\tilde{O}(1)$ roots to $F$. This allows us to
satisfy the second property, which is that the stretch of $e$ with respect to $F$ (Definition 6.4) is
upper bounded by some global upper bound $\widetilde{\mathrm{str}}_e$. Note that $\widetilde{\mathrm{str}}_e$ stays the same for any edge $e$
across the execution of the algorithm. The third property says that these global upper bounds are
still good on average with respect to the weights $v$ up to $\tilde{O}(1)$ factors. The final property is useful
for interacting with our sparsifier in Theorem 5.1 whose runtime and congestion depend on the
maximum degree of the input graphs.

We defer the proof of Lemma 6.5 to Appendix B.3. 


6.2 Worst-Case Average Stretch via Multiplicative Weights 


By doing a multiplicative weights update procedure (MWU) on top of Lemma 6.5, we can build a
distribution over partial spanning tree routings whose average stretch on every edge is $\tilde{O}(1)$. This
is very similar to MWUs done in works of [Rac08; KLOS14] for building $\ell_\infty$ oblivious routings, and
cut approximators [Mad10; She13].


Lemma 6.6 (MWU). There is a deterministic algorithm that on a graph $G = (V, E)$ with lengths $\ell$
and a positive integer $k$ computes $t$ spanning trees, rooted spanning forests, and stretch overestimates
$\{(T_i, F_i \subseteq T_i, \widetilde{\mathrm{str}}^i_e)\}_{i=1}^{t}$ (Lemma 6.5) for some $t = O(k)$ such that

$$\sum_{i=1}^{t} \lambda_i \widetilde{\mathrm{str}}^i_e \le O(\gamma_{LSST} \log^2 n) \quad \text{for all } e \in E, \qquad (25)$$

where $\lambda \in \mathbb{R}^{[t]}_{>0}$ is the uniform distribution over the set $[t]$, i.e. $\lambda = \vec{1}/t$.
The algorithm runs in $O(mk)$ time.


The proof is standard and deferred to Appendix B.4. 

If we sample a single tree/index from the distribution $\lambda$, then any fixed flow will be stretched by
$O(\gamma_{LSST} \log^2 n)$ on average. Hence any fixed flow will be stretched by $O(\gamma_{LSST} \log^2 n)$ by at least
one of $O(\log n)$ trees sampled from $\lambda$ with high probability. We will leverage this fact to analyze
how the witness circulation $c$ in Definition 6.1 and Theorem 6.2 is stretched by a random forest.
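The step from "good on average" to "good for at least one of $O(\log n)$ samples" is a standard boosting argument: by Markov's inequality a single sample exceeds twice its expectation with probability at most $1/2$, so $B$ independent samples all fail with probability at most $2^{-B}$. The small Python sketch below works out the arithmetic; the function names and the target failure probability $n^{-c}$ are illustrative assumptions.

```python
import math


def independent_trees_needed(n, c):
    """Smallest integer B with 2**(-B) <= n**(-c)."""
    return math.ceil(c * math.log2(n))


def all_samples_bad_bound(B):
    """Upper bound on Pr[all B independent samples exceed twice the mean]."""
    return 0.5 ** B
```

For instance, with $n = 1024$ and $c = 3$ this gives $B = 30 = 3 \log_2 n$ samples, so $B = O(\log n)$ suffices for any fixed polynomial failure probability.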


6.3 Sparsified Core Graphs and Path Embeddings 


Given a rooted spanning forest F, we will recursively process the graph G/F where each connected 
component of F is contracted to a single vertex represented by the root. We call this the core 
graph, and define the lengths and gradients on it as follows. Below, we should think of $G$ as the
result of edge insertions/deletions to an earlier graph $G^{(0)}$, so $\widetilde{\mathrm{str}}_e = 1$ for edges inserted to get from
$G^{(0)}$ to $G$, as enforced in Lemma 6.5.


Definition 6.7 (Core graph). Consider a tree $T$ and a rooted spanning forest $E(F) \subseteq E(T)$
on a graph $G$ equipped with stretch overestimates $\widetilde{\mathrm{str}}_e$ satisfying the guarantees of Lemma 6.6.
We define the core graph $C(G, F)$ as a graph with the same edge and vertex set as $G/F$. For
$e = (u, v) \in E(G)$ with image $\hat{e} \in E(G/F)$ we define its length as $\ell^{C(G,F)}_{\hat{e}} \stackrel{\mathrm{def}}{=} \widetilde{\mathrm{str}}_e \ell_e$ and gradient as

$$g^{C(G,F)}_{\hat{e}} \stackrel{\mathrm{def}}{=} g_e + \langle g, p(T[v, u]) \rangle.$$




Remark 6.8. In our usage, we maintain $C(G, F)$ where $G$ is a dynamic graph and $F$ is a decremental
rooted spanning forest. In particular, $T$, $F$, and $\widetilde{\mathrm{str}}$ are initialized and maintained via
Lemma 6.5. As $G$ undergoes dynamic updates such as edge deletions or vertex splits, which add
new vertices to $G$, $T$ will no longer be a spanning tree of $G$. Definition 6.7 handles this
situation by allowing $T$ to be neither a spanning tree nor a subgraph of $G$.

Thus, for $e = (u, v) \in E(G)$, $u$ and $v$ may not be connected in $T$. In this case, the value of
$g^{C(G,F)}_{\hat{e}}$ is simply $g_e$. Also, the support of the gradient vector $g$ is $E(G) \cup E(T)$. This corresponds
to the case where some edge in $T$ is removed from $G$: we keep the gradient on that edge as it is.


Note that the length and gradient of the image of an edge e € E(G) in Definition 6.7 do 
not change under edge deletions to F, because they are defined with respect to the tree T. This 
important property will be useful later in efficiently maintaining a sparsifier of the core graph, 
which we require to reduce the number of edges in the sparsified core graph to $O(m/k)$.
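The following Python sketch computes the edge images, lengths, and gradients of Definition 6.7 on a toy instance and illustrates the invariance under deletions from $F$ noted above. Several conventions here are assumptions of the example, not taken from the text: $T$ is connected and encoded by parent pointers, each tree edge is oriented from child to parent with gradient `tree_g[child]`, and $F$ is given as the set of children whose parent edge is kept. Under this orientation, $\langle g, p(T[v, u]) \rangle$ equals the sum of tree gradients from $v$ to the root minus the same sum from $u$.

```python
def core_graph(edges, g, ell, stretch, tree_parent, tree_g, forest_children):
    """Images, lengths and gradients of all edges in C(G, F)."""

    def pot(x):
        # Sum of tree-edge gradients along the tree path from x to the root of T.
        total = 0
        while tree_parent[x] is not None:
            total += tree_g[x]
            x = tree_parent[x]
        return total

    def root(x):
        # Root of x's component in F: climb while the parent edge is in F.
        while x in forest_children:
            x = tree_parent[x]
        return x

    core = []
    for e, (u, v) in enumerate(edges):
        core.append({
            "image": (root(u), root(v)),
            "length": stretch[e] * ell[e],
            "gradient": g[e] + pot(v) - pot(u),
        })
    return core


tree_parent = {0: None, 1: 0, 2: 1, 3: 2}
tree_g = {1: 2, 2: -1, 3: 4}
edges = [(1, 0), (2, 1), (3, 2), (0, 3)]  # three tree edges and one off-tree edge
g = [2, -1, 4, 5]
ell = [1, 1, 1, 2]
stretch = [1, 1, 1, 3]
before = core_graph(edges, g, ell, stretch, tree_parent, tree_g, {1, 3})
after = core_graph(edges, g, ell, stretch, tree_parent, tree_g, {1})
```

In `before`, the off-tree edge $(0, 3)$ maps to the core edge between roots 0 and 2 with length $3 \cdot 2 = 6$ and gradient $5 + 4 - 1 + 2 = 10$, the gradient of its fundamental cycle in $T$. Deleting the forest edge at vertex 3 (`after`) changes images but leaves every length and gradient unchanged.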


Definition 6.9 (Sparsified core graph). Given a graph $G$, forest $F$, and parameter $k$, define a
$(\gamma_s, \gamma_c, \gamma_\ell)$-sparsified core graph with embedding as a subgraph $S(G, F) \subseteq C(G, F)$ and embedding
$\Pi_{C(G,F) \to S(G,F)}$ satisfying

1. For any $\hat{e} \in E(C(G, F))$, all edges $\hat{e}' \in \Pi_{C(G,F) \to S(G,F)}(\hat{e})$ satisfy $\ell^{C(G,F)}_{\hat{e}} \approx_2 \ell^{C(G,F)}_{\hat{e}'}$.

2. $\mathrm{length}(\Pi_{C(G,F) \to S(G,F)}) \le \gamma_\ell$ and $\mathrm{econg}(\Pi_{C(G,F) \to S(G,F)}) \le k\gamma_c$.

3. $S(G, F)$ has at most $m\gamma_s/k$ edges.

4. The lengths and gradients of edges in $S(G, F)$ are the same as in $C(G, F)$ (Definition 6.7).


In Section 7 we give a dynamic algorithm for maintaining a sparsified core graph of a graph $G$
undergoing edge insertions and deletions. We defer the formal statement to Lemma 7.8 where we
not only maintain a sparsified core graph but also show that the witness circulation $c^{(t)}$ and upper
bounds $w^{(t)}$ from Definition 6.1 are preserved approximately.


6.4 Full Data Structure Chain 


Our data structure has $d$ levels. The graphs at the $i$-th level have about $m/k^i$ edges, and each such
graph branches into $O(\log n)$ graphs sampled from the distribution $\lambda$ from Lemma 6.6.


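Definition 6.10 (Branching Tree-Chain). For a graph $G$, parameter $k$, and branching factor $B$,
a $B$-branching tree-chain consists of collections of graphs $\{\mathcal{G}_i\}_{0 \le i \le d}$, such that $\mathcal{G}_0 \stackrel{\mathrm{def}}{=} \{G\}$, and we
define $\mathcal{G}_i$ inductively as follows,

1. For each $G_i \in \mathcal{G}_i$, $i < d$, we have a collection of $B$ trees $\mathcal{T}^{G_i} = \{T_1, T_2, \dots, T_B\}$ and a
collection of $B$ forests $\mathcal{F}^{G_i} = \{F_1, F_2, \dots, F_B\}$ such that $E(F_j) \subseteq E(T_j)$ satisfy the conditions
of Lemma 6.5.

2. For each $G_i \in \mathcal{G}_i$, and $F \in \mathcal{F}^{G_i}$, we maintain $(\gamma_s, \gamma_c, \gamma_\ell)$-sparsified core graphs and embeddings
$S(G_i, F)$ and $\Pi_{C(G_i,F) \to S(G_i,F)}$.

3. We let $\mathcal{G}_{i+1} \stackrel{\mathrm{def}}{=} \{S(G_i, F) : G_i \in \mathcal{G}_i, F \in \mathcal{F}^{G_i}\}$.

Finally, for each $G_d \in \mathcal{G}_d$, we maintain a low-stretch tree $F$.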

We let a tree-chain be a single sequence of graphs $G_0, G_1, \dots, G_d$ such that $G_{i+1}$ is the $(\gamma_s, \gamma_c, \gamma_\ell)$-sparsified
core graph $S(G_i, F_i)$ with embedding $\Pi_{C(G_i,F_i) \to S(G_i,F_i)}$ for some $F_i \in \mathcal{F}^{G_i}$ for $0 \le i < d$,
and a low-stretch tree $F_d$ on $G_d$.

In general, we will have $B = O(\log n)$ throughout, and we will omit $B$ when discussing branching
tree-chains. Note that level $i$ of a branching tree-chain, i.e. the collection of graphs in $\mathcal{G}_i$, has at
most $B^i = O(\log n)^i$ graphs for $B = O(\log n)$. A branching tree-chain can alternatively be viewed
as a set of $O(\log n)^d$ tree-chains, each of which naturally corresponds to a spanning tree of the top
level graph $G$.
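As a sanity check on why $O(\log n)^d$ trees with $d = O(\log^{1/8} m)$ is only an $m^{o(1)}$ factor, the following Python sketch computes the exponent $x$ with $(\log_2 m)^d = m^x$. For simplicity it sets all constants hidden by $O(\cdot)$ to 1 and uses $\log m$ in place of $\log n$; these are assumptions of the example, as is the function name.

```python
import math


def tree_count_exponent(log2_m):
    """Return x with (log2 m)^d = m^x, where d = (log2 m)^(1/8)."""
    d = log2_m ** (1 / 8)
    # (log2 m)^d = 2^(d * log2(log2 m)) = m^(d * log2(log2 m) / log2 m)
    return d * math.log2(log2_m) / log2_m
```

For $\log_2 m = 256$ this gives $d = 2$ and exponent $2 \cdot 8 / 256 = 1/16$, and for $\log_2 m = 65536$ it gives $d = 4$ and exponent $4 \cdot 16 / 65536 \approx 0.001$. The exponent tends to 0, although slowly.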


Definition 6.11 (Trees from Tree-Chains). Given a graph $G$ and tree-chain $G_0, G_1, \dots, G_d$ where
$G_0 = G$, define the corresponding spanning tree $T^{G_0, G_1, \dots, G_d} \stackrel{\mathrm{def}}{=} \bigcup_{i=0}^{d} F_i$ of $G$ as the union of
preimages of edges of $F_i$ in $G = G_0$.

Define the set of trees corresponding to a branching tree-chain of graph $G$ as the union of
$T^{G_0, G_1, \dots, G_d}$ over all tree-chains $G_0, G_1, \dots, G_d$ where $G_0 = G$:

$$\mathcal{T}^{G} \stackrel{\mathrm{def}}{=} \left\{ T^{G_0, G_1, \dots, G_d} : G_0, G_1, \dots, G_d \text{ s.t. } G_{i+1} = S(G_i, F_i) \text{ for all } 0 \le i < d \right\}.$$


We can dynamically maintain a branching tree-chain such that we rebuild $\mathcal{G}_{i+1}$ from $\mathcal{G}_i$ every
approximately $m/k^i$ updates. Between rebuilds, the trees $\mathcal{T}^{G_i}$ of graphs $G_i \in \mathcal{G}_i$ stay the same,
while the forests in $\mathcal{F}^{G_i}$ are decremental as guaranteed in Lemma 6.5.


Definition 6.12 (Previous Rebuild Times). Given a dynamic graph $G^{(t)}$ with updates indexed by
times $t = 0, 1, \dots$ and corresponding dynamic branching tree-chain (Definition 6.10), we say that
nonnegative integers $\mathrm{prev}^{(t)}_0 \le \mathrm{prev}^{(t)}_1 \le \dots \le \mathrm{prev}^{(t)}_d = t$ are previous rebuild times if $\mathrm{prev}^{(t)}_i$ was
the most recent time at or before $t$ that $\mathcal{G}_i$ was rebuilt, i.e. for $G \in \mathcal{G}_i$ the set of trees $\mathcal{T}^{G}$ was
reinitialized and sampled.


We will assume that our algorithm rebuilds all $G_i \in \mathcal{G}_i$ at the same time: if we recompute a set
of trees $\mathcal{T}^{G_i}$ for some $G_i \in \mathcal{G}_i$, then we also recompute the trees $\mathcal{T}^{G_i'}$ for all other $G_i' \in \mathcal{G}_i$. In the
following Section 7 we show Theorem 7.1, which gives a data structure whose guarantee is weaker than
that of Theorem 6.2. Precisely, the quality of the cycle returned depends on the previous rebuild times.
We later boost this to an algorithm for Theorem 6.2 by solving a rebuilding game in Section 8.


7 Routings and Cycle Quality Bounds 


The goal of this section is to explain how to route the witness circulations $c^{(t)}$ and length upper
bounds $w^{(t)}$ through the branching tree-chain, and eventually recover an approximately optimal
flow $\Delta$. The main theorem we show in this section is the following.


Theorem 7.1. Let $G = (V, E)$ be a dynamic graph undergoing $\tau$ batches of updates $U^{(1)}, \dots, U^{(\tau)}$
containing only edge insertions/deletions with edge gradient $g^{(t)}$ and length $\ell^{(t)}$ such that the update
sequence satisfies the hidden stable-flow chasing property (Definition 6.1) with hidden dynamic
circulation $c^{(t)}$ and width $w^{(t)}$. There is an algorithm on $G$ that maintains a $O(\log n)$-branching
tree chain corresponding to $s = O(\log n)^d$ trees $T_1, T_2, \dots, T_s$ (Definition 6.11), and at stage $t$
outputs a circulation $\Delta$ represented by $\exp(O(\log^{7/8} m \log\log m))$ off-tree edges and paths on some
$T_i$, $i \in [s]$.

The output circulation $\Delta$ satisfies $B^\top \Delta = 0$ and for some $\kappa = \exp(-O(\log^{7/8} m \log\log m))$

$$\frac{\langle g^{(t)}, \Delta\rangle}{\|\ell^{(t)} \circ \Delta\|_1} \le \kappa \, \frac{\langle g^{(t)}, c^{(t)}\rangle}{\sum_{i=0}^{d} \|w^{(\mathrm{prev}^{(t)}_i)}\|_1},$$

where $\mathrm{prev}^{(t)}_i$, $i \in [d]$ are the previous rebuild times (Definition 6.12) for the branching tree chain.

The algorithm succeeds w.h.p. with total runtime $(m + Q) m^{o(1)}$ for $Q \stackrel{\mathrm{def}}{=} \sum_{t=1}^{\tau} |U^{(t)}| \le \mathrm{poly}(n)$.
Also, levels $i, i+1, \dots, d$ of the branching tree chain can be rebuilt at any point in $m^{1+o(1)}/k^i$ time.

The final sentence about rebuilding levels $i, i+1, \dots, d$ allows us to set $\mathrm{prev}^{(t)}_i = \mathrm{prev}^{(t)}_{i+1} =
\dots = \mathrm{prev}^{(t)}_d = t$. This is necessary because it is possible that $\|w^{(\mathrm{prev}^{(t)}_i)}\|_1$ is much larger than
$\|w^{(t)}\|_1$ for some $0 \le i \le d$. This could result in

$$\frac{\langle g^{(t)}, c^{(t)}\rangle}{\sum_{i=0}^{d} \|w^{(\mathrm{prev}^{(t)}_i)}\|_1} \quad \text{being much more than} \quad \frac{\langle g^{(t)}, c^{(t)}\rangle}{\|w^{(t)}\|_1}.$$

This is not sufficient to show Theorem 6.2, whose hypothesis only guarantees that the latter quantity is at
most $-\alpha$, but does not assume a bound on the former. We will resolve this issue in Section 8 by
carefully using our ability to rebuild levels $i, i+1, \dots, d$ whenever the cycle $\Delta$ returned
by Theorem 7.1 is not good enough, and we deduce that $\|w^{(\mathrm{prev}^{(t)}_i)}\|_1$ is much larger than $\|w^{(t)}\|_1$
for some $0 \le i \le d$.


7.1 Passing Circulations and Length Upper Bounds Through a Tree-Chain 


Towards proving Theorem 7.1 we define how to pass the witness circulation $c$ and length upper
bounds $w$ downwards in a tree-chain. It is convenient to define a valid pair of $c, w$ with respect
to a graph $G$ with lengths $\ell$. Essentially, this means that $c$ is indeed a circulation and $w$ are valid
length upper bounds, i.e. items 1 and 2 of the hidden stable-flow chasing property Definition 6.1.


Definition 7.2 (Valid pair). For a graph $G = (V, E)$ with lengths $\ell \in \mathbb{R}^E_{>0}$, we say that $c, w \in \mathbb{R}^E$
are a valid pair if $c$ is a circulation and $|\ell_e c_e| \le w_e$ for all $e \in E$.


7.1.1 Passing Circulations and Length Upper Bounds to the Core Graph 
We first describe how to pass c, w from G to a core graph C(G, F) (Definition 6.7). 


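Definition 7.3 (Passing c,w to core graph). Given a graph $G = (V, E)$ with a tree $T$, a rooted
spanning forest $E(F) \subseteq E(T)$, and stretch overestimates $\widetilde{\mathrm{str}}_e$ as in Lemma 6.5, circulation $c \in \mathbb{R}^E$
and length upper bounds $w \in \mathbb{R}^E_{>0}$, we define vectors $c^{C(G,F)} \in \mathbb{R}^{E(C(G,F))}$ and $w^{C(G,F)} \in \mathbb{R}^{E(C(G,F))}_{>0}$
as follows. For $\hat{e} \in E(C(G, F))$ with preimage $e \in E$, define $c^{C(G,F)}_{\hat{e}} \stackrel{\mathrm{def}}{=} c_e$ and $w^{C(G,F)}_{\hat{e}} \stackrel{\mathrm{def}}{=} \widetilde{\mathrm{str}}_e w_e$.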


We verify that $c^{C(G,F)}$ is a circulation on $C(G, F)$ and that $w^{C(G,F)}$ are length upper bounds.


Lemma 7.4 (Validity of Definition 7.3). Let $c, w$ be a valid pair (Definition 7.2) on a graph $G$ with
lengths $\ell$. As defined in Definition 7.3, $c^{C(G,F)}, w^{C(G,F)}$ are a valid pair on $C(G, F)$ with lengths
$\ell^{C(G,F)}$ (Definition 6.7), and

$$\|w^{C(G,F)}\|_1 \le \sum_{e \in E(G)} \widetilde{\mathrm{str}}_e w_e.$$


Proof. The proof is primarily checking the definitions. Recall that the edge set of $C(G, F)$ is that of $G/F$.
Contracting vertices preserves circulations, hence $c^{C(G,F)}$ is a circulation (as $c$ is).
In Definition 6.7 we define $\ell^{C(G,F)}_{\hat{e}} = \widetilde{\mathrm{str}}_e \ell_e$. So

$$|\ell^{C(G,F)}_{\hat{e}} c^{C(G,F)}_{\hat{e}}| = \widetilde{\mathrm{str}}_e |\ell_e c^{C(G,F)}_{\hat{e}}| = \widetilde{\mathrm{str}}_e |\ell_e c_e| \le \widetilde{\mathrm{str}}_e w_e = w^{C(G,F)}_{\hat{e}},$$

where the inequality holds because $c, w$ are a valid pair.
The bound on $\|w^{C(G,F)}\|_1$ follows trivially by definition. The reason for the inequality (instead
of equality) is that some edges may be contracted and disappear.
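The checks in this proof can be replayed on a toy instance. The Python sketch below applies Definition 7.3 and verifies the three claims of Lemma 7.4; the encoding (a `root_of` map describing the contraction, and dropping every edge whose endpoints share a root) is a simplifying assumption of the example, as are the function names.

```python
def pass_to_core(edges, root_of, ell, stretch, c, w):
    """Apply Definition 7.3: contract components, scale ell and w by stretch."""
    core = []
    for (u, v), le, se, ce, we in zip(edges, ell, stretch, c, w):
        ru, rv = root_of[u], root_of[v]
        if ru == rv:
            continue  # simplification: edges inside one component disappear
        core.append({"edge": (ru, rv), "ell": se * le, "c": ce, "w": se * we})
    return core


def is_valid_pair(core, vertices):
    """Check Definition 7.2 on the core graph: conservation and |ell*c| <= w."""
    net = {x: 0 for x in vertices}
    for item in core:
        tail, head = item["edge"]
        net[tail] -= item["c"]
        net[head] += item["c"]
    conserved = all(x == 0 for x in net.values())
    bounded = all(abs(i["ell"] * i["c"]) <= i["w"] for i in core)
    return conserved and bounded


# Directed 4-cycle 0 -> 1 -> 2 -> 3 -> 0 with components {0, 1} and {2, 3}.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
root_of = {0: 0, 1: 0, 2: 2, 3: 2}
ell = [1, 2, 1, 2]
stretch = [3, 1, 2, 1]
c = [1, 1, 1, 1]
w = [1, 2, 1, 2]
core = pass_to_core(edges, root_of, ell, stretch, c, w)
```

Two edges survive the contraction, the pair remains valid, and $\|w^{C(G,F)}\|_1 = 4$ is strictly smaller than $\sum_e \widetilde{\mathrm{str}}_e w_e = 9$ because two edges disappeared.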


Finally we state an algorithm which takes hidden stable-flow chasing updates on a dynamic
graph $G^{(t)}$ and produces a dynamic core graph. Below, we let $c^{(t),C(G,F)}, w^{(t),C(G,F)}$ denote the result
of using Definition 7.3 for $c = c^{(t)}$ and $w = w^{(t)}$, and similar definitions for $g^{(t),C(G,F)}, \ell^{(t),C(G,F)}$
used later in the section.




Lemma 7.5 (Dynamic Core Graphs). Algorithm 3 takes as input a parameter $k$, a dynamic graph
$G^{(t)}$ undergoing $\tau$ batches of updates $U^{(1)}, \dots, U^{(\tau)}$ with gradients $g^{(t)}$, and lengths $\ell^{(t)}$ at stage $t =
0, \dots, \tau$ that satisfies $\sum_{t=1}^{\tau} \mathrm{Enc}(U^{(t)}) \le m/(k \log^2 n)$ and the hidden stable-flow chasing property
with the hidden circulation $c^{(t)}$ and width $w^{(t)}$.

For each $j \in [B]$ with $B = O(\log n)$, the algorithm maintains a static tree $T_j$, a decremental
rooted forest $F^{(t)}_j$ with $O(m/k)$ components satisfying the conditions of Lemma 6.5, and a core
graph $C(G^{(t)}, F^{(t)}_j)$:

1. Core Graphs have Bounded Recourse: the algorithm outputs update batches $U^{(t)}_j$ that produce
$C(G^{(t)}, F^{(t)}_j)$ from $C(G^{(t-1)}, F^{(t-1)}_j)$ such that $\sum_{t' \le t} \mathrm{Enc}(U^{(t')}_j) = O\left(\sum_{t' \le t} \mathrm{Enc}(U^{(t')}) \cdot \log^2 n\right)$.

2. The Widths on the Core Graphs are Small: whp. there is a $j^* \in [B]$ only depending on $w^{(0)}$
such that for $w^{(t),C(G^{(t)},F^{(t)}_{j^*})}$ as defined in Definition 7.3 for $E(F_{j^*}) \subseteq E(T_{j^*})$, and all stages
$t \in \{0, \dots, \tau\}$,

$$\left\|w^{(t),C(G^{(t)},F^{(t)}_{j^*})}\right\|_1 \le O(\gamma_{LSST} \log^2 n) \|w^{(0)}\|_1 + \|w^{(t)}\|_1. \qquad (26)$$

The algorithm runs in $O(mk)$ time.


Algorithm 3: Dynamically maintains a core graph (Definition 6.7). Procedure INITIALIZE
initializes all variables, and DYNAMICCORE takes updates to $G^{(t)}$.

global variables
    $B \leftarrow O(\log n)$: number of instantiations of Lemma 6.5
    $\mathcal{A}^{(LSD)}_j$ for $j \in [B]$: algorithms implementing Lemma 6.5
    $T_j$ for $j \in [B]$: trees initialized by $\mathcal{A}^{(LSD)}_j$
    $\widetilde{\mathrm{str}}^j$ for $j \in [B]$: stretch overestimates initialized by $\mathcal{A}^{(LSD)}_j$

procedure INITIALIZE($G = G^{(0)}, \ell, k$)
    Let $\lambda$ and $\{T'_1, \dots, T'_t\}$ be returned by Lemma 6.6 on $G$ with lengths $\ell$ and $t = O(k)$
    For $j \in [B]$ sample $i_j \in [t]$ proportional to $\lambda$, and $T_j \leftarrow T'_{i_j}$
    Initialize $\mathcal{A}^{(LSD)}_j$ on $T_j$ for $j \in [B]$

procedure DYNAMICCORE($G^{(t)}, U^{(t)}, g^{(t)}, \ell^{(t)}$)
    for $j \in [B]$ do
        Pass $U^{(t)}$ to $\mathcal{A}^{(LSD)}_j$ which updates $F^{(t-1)}_j$ to $F^{(t)}_j$
        // All edges $e \in G^{(t)} \cap (\bigcup_{s \le t} U^{(s)})$ have $\widetilde{\mathrm{str}}^j_e = 1$ by Lemma 6.5
        Let $U^{(t)}_j$ be the batch of vertex splits that updates $C(G^{(t-1)}, F^{(t-1)}_j)$ to $C(G^{(t-1)}, F^{(t)}_j)$.
        Append $U^{(t)}_j$ with $U^{(t)}$ which updates $C(G^{(t-1)}, F^{(t)}_j)$ to $C(G^{(t)}, F^{(t)}_j)$.


Algorithm 3 initializes $B$ trees $T_1, \dots, T_B$ from the MWU distribution output by Lemma 6.6.
For each of these $B$ trees, we maintain a forest $E(F^{(t)}_j) \subseteq E(T_j)$ satisfying the conditions of
Lemma 6.5 with the goal of forcing the stretch of every newly appeared edge $e$ in $G^{(t)}$ to be 1, i.e.
$\widetilde{\mathrm{str}}^j_e = 1$ for all $j \in [B]$.

Given an update batch $U^{(t)}$, Algorithm 3 first updates the forest $F^{(t-1)}_j$ to $F^{(t)}_j$ for any $j \in [B]$
using the algorithm of Lemma 6.5. For any update $x \in U^{(t)}$, if $x$ updates some edge $e$, both
endpoints are roots in the forest $F^{(t)}_j$ and they appear in the core graph. If $x$ splits some vertex $u$,
$u$ is made a root in the forest and it appears in the core graph as well. In both cases, the update
$x$ can be performed in the core graph $C(G^{(t-1)}, F^{(t)}_j)$ (notice that it is not $C(G^{(t-1)}, F^{(t-1)}_j)$). Thus, we
can apply the entire batch $U^{(t)}$ to produce $C(G^{(t)}, F^{(t)}_j)$ from $C(G^{(t-1)}, F^{(t)}_j)$.

When a vertex $u \in V(G^{(t-1)})$ is split, the algorithm of Lemma 6.5 treats it as a sequence of one
isolated vertex insertion and $O(\deg_{G^{(t)}}(u^{\mathrm{NEW}})) = O(\mathrm{Enc}(x))$ edge insertions/deletions. The newly
added isolated vertex stays isolated in the forests $F_j$, $j \in [B]$ as they are maintained decrementally
edge-wise.

We adapt the reduction when applying $U^{(t)}$ to produce $C(G^{(t)}, F^{(t)}_j)$ from $C(G^{(t-1)}, F^{(t)}_j)$. Thus,
the number of updates in the core graph is at least the total encoding size of updates in the original
graph. As we will show, it is upper-bounded by the total encoding size as well.


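Proof of Lemma 7.5. Since $\sum_{t \in [\tau]} \mathrm{Enc}(U^{(t)}) \le m/(k \log^2 n)$, we can take $q = O(m/(k \log^2 n))$
in Lemma 6.5. Thus by item 1 of Lemma 6.5, $F^{(t)}_j$ has $O(m/k)$ connected components.

Next, we prove Item 1 that bounds the number of updates to the core graph. After $t$ batches of
updates $U^{(1)}, \dots, U^{(t)}$, $\mathcal{A}^{(LSD)}_j$ increases the number of components in $F_j$ by $O\left(\sum_{t' \le t} \mathrm{Enc}(U^{(t')}) \cdot \log^2 n\right)$
according to Item 1 of Lemma 6.5. Every new component appearing in $F_j$ splits a vertex in the core
graph $C(G, F_j)$. Thus, there will be $O\left(\sum_{t' \le t} \mathrm{Enc}(U^{(t')}) \cdot \log^2 n\right)$ vertex splits in the core
graph. After updating $F_j$, every update batch $U^{(t)}$ updates $C(G, F_j)$ as it updates $G$. Thus, we
can bound the number of updates to $C(G, F_j)$ up to the first $t$ stages by $O\left(\sum_{t' \le t} \mathrm{Enc}(U^{(t')}) \cdot \log^2 n\right)$.

To show Item 2, by Lemma 7.4 we first get that

$$\left\|w^{(t),C(G^{(t)},F^{(t)}_j)}\right\|_1 \le \sum_{e \in E(G^{(t)})} \widetilde{\mathrm{str}}^j_e w^{(t)}_e \overset{(i)}{=} \sum_{e \in G^{(t)} \setminus (\bigcup_{s=1}^{t} U^{(s)})} \widetilde{\mathrm{str}}^j_e w^{(t)}_e + \sum_{e \in G^{(t)} \cap (\bigcup_{s=1}^{t} U^{(s)})} w^{(t)}_e \overset{(ii)}{\le} 2 \sum_{e \in G^{(0)}} \widetilde{\mathrm{str}}^j_e w^{(0)}_e + \|w^{(t)}\|_1, \qquad (27)$$

where (i) follows because every edge $e$ that appeared in $G^{(t)}$ due to some update in some $U^{(s)}$ has
$\widetilde{\mathrm{str}}^j_e = 1$ by a condition of Lemma 6.5. (ii) follows because the hidden stable-flow chasing property
(Definition 6.1 item 3) gives that any edge $e \in G^{(t)} \setminus \bigcup_{s=1}^{t} U^{(s)} \subseteq G^{(0)}$ has $w^{(t)}_e \le 2 w^{(0)}_e$.

Now recall that $T_j$ is sampled from the collection $\{T'_1, \dots, T'_t\}$ of trees given by Lemma 6.6,
with probabilities proportional to $\lambda$. Hence

$$\mathbb{E}_{i_j}\left[\sum_{e \in G^{(0)}} \widetilde{\mathrm{str}}^j_e w^{(0)}_e\right] = \sum_{e \in G^{(0)}} w^{(0)}_e \sum_{i=1}^{t} \lambda_i \widetilde{\mathrm{str}}^{i}_e \le O(\gamma_{LSST} \log^2 n) \|w^{(0)}\|_1,$$

by the guarantees of Lemma 6.6 (with the index $i$ on the right-hand side ranging over the trees of Lemma 6.6), so by Markov's inequality

$$\Pr_{i_j}\left[\sum_{e \in G^{(0)}} \widetilde{\mathrm{str}}^j_e w^{(0)}_e \le O(\gamma_{LSST} \log^2 n) \|w^{(0)}\|_1\right] \ge 1/2.$$

Since we sample $B$ independent trees $T_j$ for $B = O(\log n)$, we get that there exists a $j^*$ satisfying
(26) with probability at least $1 - 2^{-B} \ge 1 - n^{-O(1)}$.
Finally, the algorithm runs in total time $O(mk)$ by Lemma 6.6 and Lemma 6.5.

7.1.2 Passing Circulations and Length Upper Bounds to the Sparsified Core Graph

We describe how to pass $c^{C(G,F)}, w^{C(G,F)}$ on a core graph to a sparsified core graph $S(G, F)$.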


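Definition 7.6 (Passing c,w to sparsified core graph). Consider a graph $G$ with spanning forest
$F$, and circulation $c^{C(G,F)} \in \mathbb{R}^{E(C(G,F))}$ and upper bound $w^{C(G,F)} \in \mathbb{R}^{E(C(G,F))}_{>0}$, and embedding
$\Pi_{C(G,F) \to S(G,F)}$ for a $(\gamma_s, \gamma_c, \gamma_\ell)$-sparsified core graph $S(G, F) \subseteq C(G, F)$. Define

$$c^{S(G,F)} = \sum_{\hat{e} \in E(C(G,F))} c^{C(G,F)}_{\hat{e}} \, \Pi_{C(G,F) \to S(G,F)}(\hat{e}), \qquad (28)$$

$$w^{S(G,F)} = 2 \sum_{\hat{e} \in E(C(G,F))} w^{C(G,F)}_{\hat{e}} \left|\Pi_{C(G,F) \to S(G,F)}(\hat{e})\right|. \qquad (29)$$

We check that $c^{S(G,F)}$ is a circulation on $S(G, F)$ and $w^{S(G,F)}$ are length upper bounds.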


Lemma 7.7 (Validity of Definition 7.6). Let $c^{C(G,F)}, w^{C(G,F)}$ be a valid pair on graph $C(G, F)$
with lengths $\ell^{C(G,F)}$. As defined in Definition 7.6, $c^{S(G,F)}, w^{S(G,F)}$ is a valid pair on $S(G, F)$ with
lengths $\ell^{S(G,F)}$ (Definition 6.9). Also,

$$\|w^{C(G,F)}\|_1 \le \|w^{S(G,F)}\|_1 \le O(\gamma_\ell) \|w^{C(G,F)}\|_1.$$


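Proof. Let $B_S, B_C$ be the incidence matrices of $S(G, F), C(G, F)$ respectively, and write $\Pi = \Pi_{C(G,F) \to S(G,F)}$ for brevity. To see that $c^{S(G,F)}$
is a circulation, we write

$$B_S^\top c^{S(G,F)} = \sum_{\hat{e} \in E(C(G,F))} c^{C(G,F)}_{\hat{e}} B_S^\top \Pi(\hat{e}) = \sum_{\hat{e} \in E(C(G,F))} c^{C(G,F)}_{\hat{e}} b_{\hat{e}} = B_C^\top c^{C(G,F)} = 0.$$

To see that $w^{S(G,F)}$ are valid upper bounds, for all $\hat{e}' \in E(S(G, F))$

$$\left|\ell^{S(G,F)}_{\hat{e}'} c^{S(G,F)}_{\hat{e}'}\right| \le \sum_{\hat{e} : \hat{e}' \in \Pi(\hat{e})} \ell^{S(G,F)}_{\hat{e}'} \left|c^{C(G,F)}_{\hat{e}}\right| \overset{(i)}{\le} 2 \sum_{\hat{e} : \hat{e}' \in \Pi(\hat{e})} \ell^{C(G,F)}_{\hat{e}} \left|c^{C(G,F)}_{\hat{e}}\right| \le 2 \sum_{\hat{e} : \hat{e}' \in \Pi(\hat{e})} w^{C(G,F)}_{\hat{e}} = w^{S(G,F)}_{\hat{e}'}.$$

Throughout, we used several properties guaranteed in Definition 6.9, and (i) specifically follows by
item 1. The final equality follows by the definition of $w^{S(G,F)}$ in (29).
Finally we upper-bound $\|w^{S(G,F)}\|_1$ by

$$\|w^{S(G,F)}\|_1 = 2 \sum_{\hat{e} \in E(C(G,F))} w^{C(G,F)}_{\hat{e}} \|\Pi(\hat{e})\|_1 \le 2 \|w^{C(G,F)}\|_1 \, \mathrm{length}(\Pi) \le O(\gamma_\ell) \|w^{C(G,F)}\|_1,$$

because $S(G, F)$ is a $(\gamma_s, \gamma_c, \gamma_\ell)$-sparsified core graph. $\|w^{C(G,F)}\|_1 \le \|w^{S(G,F)}\|_1$ follows directly
from the definition.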
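Equations (28) and (29) amount to routing each core edge along its embedding path. The Python sketch below does this on a toy instance; representing $\Pi(\hat{e})$ as a list of `(sparse_edge_index, sign)` pairs and reading $\mathrm{length}(\Pi)$ as the maximum number of edges on an embedding path are assumptions of the example, as is the function name.

```python
def pass_to_sparsifier(c_core, w_core, embedding, num_sparse_edges):
    """Route c and w from C(G, F) into S(G, F) along embedding paths."""
    c_s = [0] * num_sparse_edges
    w_s = [0] * num_sparse_edges
    for ce, we, path in zip(c_core, w_core, embedding):
        for idx, sign in path:
            c_s[idx] += sign * ce  # equation (28): signed path indicator
            w_s[idx] += 2 * we     # equation (29): absolute path indicator
    return c_s, w_s


# Core graph: triangle 0 -> 1 -> 2 -> 0 plus chord 0 -> 2.
# Sparsifier: the triangle only; the chord is embedded on the path 0 -> 1 -> 2.
sparse_edges = [(0, 1), (1, 2), (2, 0)]
embedding = [[(0, 1)], [(1, 1)], [(2, 1)], [(0, 1), (1, 1)]]
c_core = [1, 1, 2, 1]
w_core = [1, 1, 2, 3]
c_s, w_s = pass_to_sparsifier(c_core, w_core, embedding, 3)
```

Here `c_s` is two units around the triangle, which is again a circulation, and $\|w^{C}\|_1 = 7 \le \|w^{S}\|_1 = 20 \le 2 \cdot 2 \cdot 7$, in line with Lemma 7.7 for an embedding whose longest path has two edges.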


We can now give an algorithm that takes a dynamic graph $G^{(t)}$ undergoing hidden stable-flow
chasing updates, and maintains a sparsified core graph also undergoing hidden stable-flow chasing
updates, such that the total size of updates increases by a factor of at most $m^{o(1)}$. This shows how
to pass from level $i$ to $i + 1$ in a tree-chain (Definition 6.10).

Lemma 7.8 (Dynamic Sparsified Core Graphs). Algorithm 4 takes as input a parameter $k$, a
dynamic graph $G^{(t)}$ undergoing $\tau$ batches of updates $U^{(1)}, \dots, U^{(\tau)}$ with gradients $g^{(t)}$, lengths $\ell^{(t)}$ at
stage $t = 0, \dots, \tau$ that satisfies $\sum_{t=1}^{\tau} \mathrm{Enc}(U^{(t)}) \le m/(k \log^2 n)$ and the hidden stable-flow chasing
property with the hidden circulation $c^{(t)}$ and width $w^{(t)}$.

The algorithm maintains for each $j \in [B]$ (for $B = O(\log n)$), a decremental forest $F^{(t)}_j$, a static
tree $T_j$ satisfying the conditions of Lemma 6.5, and a $(\gamma_s, \gamma_c, \gamma_\ell)$-sparsified core graph $S(G^{(t)}, F^{(t)}_j)$
for parameters $\gamma_s = \gamma_c = \gamma_\ell = \exp(O(\log^{3/4} m \log\log m))$ with embedding $\Pi_{C(G^{(t)},F^{(t)}_j) \to S(G^{(t)},F^{(t)}_j)}$:

1. Sparsified Core Graphs have Low Recourse: the algorithm outputs update batches $U^{(t)}_{S,j}$ that
produce $S(G^{(t)}, F^{(t)}_j)$ from $S(G^{(t-1)}, F^{(t-1)}_j)$ such that $\sum_{t' \le t} \mathrm{Enc}(U^{(t')}_{S,j}) = \gamma_r \cdot \sum_{t' \le t} \mathrm{Enc}(U^{(t')})$
for some $\gamma_r = \exp(O(\log^{3/4} m \log\log m))$,

2. Sparsified Core Graphs undergo Hidden Stable-Flow Chasing Updates: for each $j \in [B]$, the
update batches $U^{(t)}_{S,j}$ to the sparsified core graph along with the associated gradients $g^{(t),S(G^{(t)},F^{(t)}_j)}$
and lengths $\ell^{(t),S(G^{(t)},F^{(t)}_j)}$ as defined in Definition 6.9 satisfy the hidden stable-flow chasing
property (see Definition 6.1) with the hidden circulation $c^{(t),S(G^{(t)},F^{(t)}_j)}$ and width $w^{(t),S(G^{(t)},F^{(t)}_j)}$
as defined in Definition 7.6, and

3. The Widths on the Sparsified Core Graphs are Small: for each $j \in [B]$, the width on the sparsified
core graph $S(G^{(t)}, F^{(t)}_j)$ is bounded as follows:


< OC) (1w + Jew], 
1 1 


[aoe 


Also, whp. there is a $j^* \in [B]$ only depending on $w^{(0)}$ such that


(t) plt) 
[wos oe 


, $8 (le, + oI) (30) 


The algorithm runs in total time $O(mk \cdot \gamma_r)$.


Algorithm 4 essentially maintains the sparsified core graphs $S(G^{(t)}, F^{(t)}_j)$ by passing the core
graphs $C(G^{(t)}, F^{(t)}_j)$ into the dynamic spanner Theorem 5.1. Intuitively, because $F^{(t)}_j$ is decremental,
the graph $C(G^{(t)}, F^{(t)}_j)$ changes by undergoing vertex splits, plus additional edge insertions and
deletions induced by the update batch $U^{(t)}$.

Similar to Lemma 6.5, Algorithm 4 treats each update batch $U^{(t)}$ to $G$ as $O(\mathrm{Enc}(U^{(t)}))$ edge
insertions/deletions and isolated vertex insertions. In particular, for any update $x \in U^{(t)}$ that splits
a vertex $u \in G^{(t-1)}$, it is treated as an update sequence of inserting one isolated vertex $u^{\mathrm{NEW}}$ and
then deleting/inserting $\deg_{G^{(t)}}(u^{\mathrm{NEW}})$ edges.


However, each edge insertion/deletion causes $O(1)$ vertex splits in the core graph $C(G^{(t)}, F^{(t)}_j)$.
As vertices in $C(G^{(t)}, F^{(t)}_j)$ could have degree $\Omega(k)$, we cannot afford treating vertex splits in the
core graph as a sequence of edge insertions/deletions. This would represent $S(G^{(t)}, F^{(t)}_j)$ using
updates of total encoding size $O(k \cdot \sum_t \mathrm{Enc}(U^{(t)})) = O(m)$ instead of $O(m^{1+o(1)}/k)$. Using the
dynamic spanner of Theorem 5.1 resolves the issue as it handles vertex splits with low recourse.
In particular, $S(G^{(t)}, F^{(t)}_j)$ can be represented using a sequence of updates with total encoding size
$O(m^{o(1)} \cdot \sum_t \mathrm{Enc}(U^{(t)}))$.

Algorithm 4: Dynamically maintains a sparsified core graph (Definition 6.7). Procedure
INITIALIZE initializes all variables, and DYNAMICSPARSECORE takes updates to $G^{(t)}$.

global variables
    $B \leftarrow O(\log n)$: number of instantiations of Lemma 6.5 in Lemma 7.5
    $\mathcal{A}^{(Core)}$: algorithm implementing Lemma 7.5
    $\mathcal{A}^{(Spanner)}_j$ for $j \in [B]$: algorithms implementing Theorem 5.1

procedure INITIALIZE($G = G^{(0)}, \ell, k$)
    $\mathcal{A}^{(Core)}$.INITIALIZE($G, \ell, k$)
    for $j \in [B]$ do
        Let $\mathcal{W}$ be the partition in Lemma 6.5 item 4, and $R \supseteq \partial\mathcal{W}$ an initial set of roots
        obtained from running the algorithm in Lemma 6.5.
        Create graph $C_j$ by splitting vertices of $C(G, F_j)$ into vertices $u_r$ for each $r \in R$, and
        a vertex $u_W$ for the set of vertices $W \setminus R$ for each $W \in \mathcal{W}$.
        // The vertices $u_r$ will not be split further, and $u_W$ all have degree at most $O(k)$.
        // Also, a deletion to $F_j$ will only split a single vertex.
        Let $\Lambda_j = \Lambda_{C_j \to C(G,F_j)}$ be the bijection between $E(C_j)$ and $E(C(G, F_j))$.
        Initialize $\mathcal{A}^{(Spanner)}_j$ on $C_j$, the split version of $C(G, F_j)$.
        Let $S_j$ be the spanner maintained by $\mathcal{A}^{(Spanner)}_j$ and $S(G, F_j) = \Lambda_j(S_j)$.

procedure DYNAMICSPARSECORE($G^{(t)}, U^{(t)}, g^{(t)}, \ell^{(t)}$)
    $\mathcal{A}^{(Core)}$.DYNAMICCORE($G^{(t)}, U^{(t)}, g^{(t)}, \ell^{(t)}$)
    for $j \in [B]$ do
        Let $U^{(t)}_j$ be the update batch that produces $C(G^{(t)}, F^{(t)}_j)$ from $C(G^{(t-1)}, F^{(t-1)}_j)$.
        Split $U^{(t)}_j$ into the sub-batch containing all edge insertions and the sub-batch containing the rest.
        Update $C_j$ with the non-insertion sub-batch (mapped through $\Lambda_j$) using $\mathcal{A}^{(Spanner)}_j$.
        Update $S_j$ via inserting the edges of the insertion sub-batch (mapped through $\Lambda_j$) directly.
        Let $R_j \subseteq E(S_j)$ be the re-embedded set output by $\mathcal{A}^{(Spanner)}_j$.
        Let $U^{(t)}_{S,j}$ be the corresponding update batch that produces $S(G^{(t)}, F^{(t)}_j)$ from
        $S(G^{(t-1)}, F^{(t-1)}_j)$.
        Append $U^{(t)}_{S,j}$ with $\Lambda_j(R_j)$ and output $U^{(t)}_{S,j}$.
        // Although edges in $\Lambda_j(R_j)$ remain unchanged in $S(G^{(t)}, F^{(t)}_j)$, we force
        // re-insertions on them in the output batch of updates.


Formalizing this approach requires discussion of several technical points. First, we cannot
simply maintain the spanner of $C(G^{(t)}, F^{(t)}_j)$ using Theorem 5.1, which does not support edge
insertions. Instead of modifying the dynamic spanner algorithm, we deal with edge insertions
naively by inserting each of them to $S(G^{(t)}, F^{(t)}_j)$. As the total number of edge insertions is at most
$\sum_{t \in [\tau]} \mathrm{Enc}(U^{(t)}) = o(m/k)$, $S(G^{(t)}, F^{(t)}_j)$ is still sparse enough.

Second, vertices in core graphs $C(G^{(t)}, F^{(t)}_j)$, $j \in [B]$ might have degree $\Omega(k)$. To ensure a
maximum degree bound of $\tilde{O}(k)$ which is required by Theorem 5.1, we artificially split vertices
in $C(G^{(t)}, F^{(t)}_j)$ to create a graph $C_j$ on which we maintain the spanner. Precisely, we create a
new vertex $u_W$ in $C_j$ for each piece $W$ in the partition $\mathcal{W}$ of the forest $F_j$, and a vertex $u_r$ for
each root in the initial forest $F^{(0)}_j$. Throughout the execution, we ensure that every vertex of $C_j$
is either $u_r$ for some $r$ being a root in the current forest, or $u_X$ for some connected component
$X \subseteq W$ of an initial piece $W \in \mathcal{W}$. In the former case, $u_r$ corresponds to a single vertex in the
original graph $G^{(t)}$ and thus it is never split due to edge removals from the forest $F^{(t)}_j$. In the latter
case, $u_X$ corresponds to the set of vertices $X \setminus R$ and thus its degree is bounded by $\tilde{O}(k)$ due to item 4
of Lemma 6.5.


Proof of Lemma 7.8. We first argue that the graph C; for all j € [B] has maximum degree O(k) 
and O(m/k) vertices, and undergoes a total of O(m/k) vertex splits, and edge insertions/deletions. 
This shows that the application of the dynamic spanner algorithm in Theorem 5.1 is efficient. 

For any j € [B], each vertex of C; is either ur for some root r € R or ux for some connected 
component X of an initial piece W € W. In the case of ur, it will not be split further. In the case of 
ux, it corresponds to the set of vertices X \ R. Since X is a connected component in F of an initial 
piece W € W, the degree of ux is at most degg(W \ R) which is O(k) due to 4 of Lemma 6.5. 

The data structure implementing Lemma 6.5 inside $\mathcal{A}^{(\mathrm{Core})}$ ensures that $F_j$ is decremental. Edge deletions in $F_j$ affect neither the gradients nor the lengths of edges in $C(G, F_j)$ (Definition 6.7). Thus, one edge deletion in $F_j$ corresponds to only a single vertex split in $C(G, F_j)$, and the total number of vertex splits that happen to $C(G, F_j)$ can be bounded by the number of edge removals in $F_j$. This number is $O(m/k + q \log^2 n) = O(m/k)$ for $q = \sum_t \mathrm{Enc}(U^{(t)}) \le m/(k \log^2 n)$ by item 1 of Lemma 6.5. Similarly, each edge deletion in $F_j$ causes one vertex split in $\hat{C}_j$. To see this, first note that no root vertices $u_r \in \hat{C}_j$ are ever split. For the deletion of an edge $e$ from $F_j$, let $W \in \mathcal{W}$ be the partition piece containing $e$. The vertex $u_W$ may have been split further already, so let $e$ be currently inside the connected component $X \subseteq W$. Now, because $\partial W \subseteq R$ at all times, we get that only $u_X$ is split in $\hat{C}_j$, as desired.


After updating F a and the enlarged vertex set of C(G®, F i we process every update of 


U,U™ naively as O(q) edge updates. As each edge update to C(G®, F;) corresponds to one edge 
update to C;, the number of edge updates happened to C; is also O(q) = O(m/(k log? n)). It remains 
to bound the initial number of vertices in C; by O(m/k). As noted in Algorithm 4, there is one 
vertex per root of the initial forest Fj and one vertex per cluster of the partition W. The number of 
roots initially is O(m/k) (Lemma 6.5). The number of clusters in W is also O(m/k) (Lemma 6.5). 
Thus, C; has O(m/k) vertices initially. 

As noted in Algorithm 4, it is at most twice the initial number of roots in F; which is O(m/k). 


Bounding the total size of U = (Item 1): Fix some j € [B]. As discussed above, processing 


all updates in the data structure of Lemma 6.5 causes at most O(m/k + >; Enc(U™) log? m) = 
O(m/k) vertex splits to C;. So, the graph C; undergoes at most O(m/k) vertex splits and edge 
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insertions/deletions. By item 3 of Theorem 5.1, the data structure A! outputs the re- 


embedded set RY of amortized size at most Jr, by taking L = (logm)!/4 in Theorem 5.1. Thus, 
the total size of re-eembedded edges >>, RP] is bounded by O(myr/k). Similarly, Theorem 5.1 also 
shows that S(G, F i) are (Ys, Ye, 1) Sparsified core graphs with the embeddings i, 

Now, we move towards checking the remaining conditions: showing the hidden stable-flow 


chasing property of the outputs on S (GO, FM) for all j € [B], and (30). For simplicity, we use 
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m” to denote the embedding II throughout the remainder of this proof. 


c(G®,F)+8(GO,FO) 


(t) (t) 
cE) SGO FF") (E), S(G®, F3”) 


Showing hidden stable-flow chasing property (Item 2): and w 
form a valid pair by Lemma 7.7. Therefore, items 1 and 2 of Definition 6.1 are satisfied. 

Next, we prove item 3 of Definition 6.1. At any stage t € [7] and any edge e € S(GM, F “j for 
some j € [B], suppose e also appears in an earlier stage t’, i.e. e € S(G™), R) for some t < t. 
e is not included in any of an s € (t',t]. Thus, we have e C wao otherwise e is 


included in some U, s € (t',t] due to the definition of re-embedded set (Item 3 of Theorem 5.1). 


For any edge e’ € i)", it exists in the core graph at both stage t and t, ie. œ € 
c(G®, FM) and cern), Let e'f be its pre-image in G. e’? also exists in G at both stage t 
and t. Since G is undergoing hidden stable-flow chasing updates, by item 3 of Definition 6.1 we 
have 
t), G® 
1G 


A 


E), GE 
é < 2 R WG . 


Definition 7.3 and the immutable nature of str from Lemma 6.5 yields 


/ 
CGF) THe w,ct Tje (1),G@) (CGO FP) 
w 1G G = 2 -w ` 


) 
<2- Str ia Wi e! (31) 


Combining with the fact that ta = (mS) -*(e) and Definition 7.6 yields the following 
and proves item 3 of Definition 6.1: 


t),S(GO Fo t),c(GO, FO 
SOF) a, 3 pO) 
1 
ee(n) (e) 
, an) pE) 
< 2 é 2 é 5 wt ),C(G A ) 
ee(n) (e) 
<2.2. 5 ores £ =2. yt SCO FO ) 


t) p(t) t) pt) 
Item 4 follows directly from the definition of LOS FS”) and wt) SG Fy ) 


( 
Upper-bounding jv) SCOED (Item 3): For any i, Lemma 7.7 yields 


[woseo 


< O(m) 
1 


heee 


1 
Lemma 7.5 gives that there is an i* € [B] such that for all t, 


< Õlyrssr log? njw + Iw Iz. 


peca 
1 


Combining these gives the desired bound.
Runtime: The time spent on the data structure implementing Lemma 6.5 is $O(mk)$. Before using the dynamic spanner of Theorem 5.1, we split each $C(G, F_j)$, $j \in [B]$, in Line 9. This ensures that the maximum degree of the input graph to each $\mathcal{A}_j^{(\mathrm{Spanner})}$, $j \in [B]$, is $O(k)$. By Lemma 7.5, none of these vertices is split in an update to $C(G, F_j)$, so we may still apply Theorem 5.1. Thus, the time spent on every dynamic spanner is $O(mk\gamma_r)$.


7.1.3 Maintaining a Branching Tree-Chain 


Note that Definitions 7.3 and 7.6 give a way to pass $c$, $w$ from the top level graph
G downwards through a tree-chain (Definition 6.10). We omel. this by proving that we can 
dynamically maintain a branching tree-chain (Definition 6.10). 


Lemma 7.9 (Dynamic Branching Tree-Chain). Algorithm 5 takes as input a parameter d, a dy- 
namic graph G®) undergoes t batches of updates U™,...,U™ with gradients g™, length € at 
stage t = 0,...,7 that satisfies the hidden stable-flow chasing property (Definition 6.1) with hidden 
circulation ec, and width w®. The algorithm explicitly o a B- ge tree-chain (Def- 
inition 6.10) with previous rebuild times prev”, cor prev (Definition 6.12). cC wO for 
Ge gi for all0 < i < d are recursively defined via Definition 7.3, 7.6 then P is a tree-chain 
Go, ..., Ga with 


[wS] < Ol) y (Semen n+ ho) for alli € {0,1,...,d}. (32) 


The algorithm succeeds with high probability and runs in total time m40 (yeyr)? (m + Q) for 


Q = X; Enc(U) < poly(n). 

Remark 7.10. Theorem 7.1 maintains the data structure implementing Lemma 7.9 on dynamic 
graphs undergoing only edge insertions/deletions. However, it can be modified to also support vertex 
splits since it is built using Lemmas 6.5 and 7.8 and Theorem 5.1, all of which support vertex splits.


Algorithm 5 initializes a $B$-branching tree chain as in Definition 6.10. For every graph $G \in \mathcal{G}_i$ at some level $i$, it maintains a collection of forests, trees, and sparsified core graphs using the dynamic data structure from Lemma 7.8.

However, the data structure of Lemma 7.8 can only take up to $m/(k \log^2 n)$ updates if the input graph has at most $m$ edges at all times. This forces us to rebuild the data structure every once in a while. In particular, we rebuild everything at every level $i \ge i_0$ if any of the data structures of Lemma 7.8 on some level-$i_0$ graph $G \in \mathcal{G}_{i_0}$ has accumulated too many updates (approximately $m/k^{i_0}$). We will show that the cost of rebuilding amortizes well across dynamic updates.


Proof of Lemma 7.9. At any level $i = 0, \ldots, d$, there are at most $O(\log n)^i$ graphs maintained at the $i$-th level at any given stage $t$ due to Lemma 7.8. That is, we have $|\mathcal{G}_i^{(t)}| \le O(\log n)^i$ for any $t$ and $i$. At any stage $t$ and level $i > 0$, every graph $G \in \mathcal{G}_i^{(t)}$ has at most $m\gamma_s^{i-1}/k^i$ vertices and $m(\gamma_s/k)^i$ edges. This is again due to Lemma 7.8.

To analyze the runtime, note that every $m\gamma_s^i/(k^{i+1} \log^2 n)$ updates to some graph $G \in \mathcal{G}_i$ create $O(m\gamma_s^i\gamma_r/k^{i+1})$ updates to every $S(G, F_j)$ for $j \in [B]$ by Lemma 7.8. Therefore, over the course of $Q$ updates to the top level graph $G$, the total number of updates to the $B = O(\log n)$ graphs at level $i$ is


Q-0 (yr logn}. 
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Algorithm 5: Dynamically maintains a B-branching tree chain (Definition 6.10). Proce- 
dure INITIALIZE initializes all variables, REBUILD rebuilds the data structure of level at 
least ds at stage to, and DYNAMICBRANCHINGCHAIN takes updates to Gl). 


1 global variables 


2 de log!/ 8n: number of levels in the maintained branching tree chain. 
3 k + m'/4; reduction factor used in Lemma 6.5. 
4 B + O(logn): number of sparsified core graphs maintained in Lemma 7.8. 


5 procedure INITIALIZE(G©), £) 
6 Initialize Go = {Gy 
| REeEBUILD(O, 0) 


8 procedure REBUILD(ig, to) 

9 for i =io,...,d—1do 

10 prev; ; < to. 

11 Gi+ı — {} 

12 for G € G; do 

13 | AlSparseCore) TNITIALIZE(G, £a) 
14 For j € [B], add S(G, F;) to Gi+1- 


15 procedure DYNAMICBRANCHINGCHAIN(G®,U® , g®) a) 


(t) 
16 | Us — UY 
17 for i = 0,...,d — 1 do 
18 if The accumulated encoding size of updates of any G € Gi exceeds 
m(ys/k)'+"/log? n then 
19 | REBUILD(?, t) 
20 for G € G; do 
21 L Te F;) | jE [B]} -] AlsparseCore) DYNAMICSPARSECORE(G, UË) 


Next we analyze the runtime cost of rebuilding level ig. By Lemma 7.8, it takes O(m(ys/k} kyr) 
time to initialize A‘’*) for any graph G € Gi at any level i. Therefore, the cost of rebuilding the 
graphs of every level i > ig is 

om 


d 
> Ollogn) + m(ys/k) "kyr = Fe OCs) kor 


i=io 
However, the rebuild happens at most every m(ys/ k) /log? n total updates to graphs at level ig. 


b U 
Thus, over the course of at most Q - O (7 log? n) : updates to every graph at level zg, the total 
runtime cost spent on rebuilding level io is at most 


(m+ Q) ky OOo). 
The overall runtime bound follows because $k = m^{1/d}$.
We now show (32) by induction on $t$ and the level $i$, and prove the result for level $i + 1$ at
(t) 
a time t given a partial tree chain Go,...,G; satisfying (32). Let er be the version of graph 
(t) 


Gi when it was rebuilt at time prev; ’. 
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(2) (t) 
at i+ 
Lemma 7.8 guarantees that there is an index j* € [B] which satisfies (30) for all stages in [prev], tl, 
where Gi+1 = S(G;, Fj). By induction and Lemma 7.8, we deduce that 


If prev;’, < t, we can use the same chain Go,...,Gj41 as the change from stage prev;_/, because 


(t) 
G) ~ rev) j (orev ) 
1 <O) wl? £ ) s 


ab |wc: 
1 


1 


Sty So fal) 
j=0 


(prev?) 
w 


(ii) 


(prev) 
w 


1 j=0 


+ O(n) (3: 


eek) 
1 


1 
1 
(prev dy 


(i) is because the vector wC) in Lemma 7.8 corresponds to wl i , as prev”) is the 
initialization time of level i. (ii) is by induction on the stage t and level i. Thus we have shown 
(32) by induction, which completes the proof. 


< O(n) (3: | 
j=0 


prev’)).G 


i 


0) 


7.2 Finding Approximate Min-Ratio Cycles in a Tree-Chain 


In this section we explain how to extract a cycle $\Delta$ from a branching tree-chain (such as the one maintained in Lemma 7.9) with large quality $|g^\top \Delta| / \|L\Delta\|_1$, satisfying the guarantees of Theorem 7.1. As a branching tree-chain consists of $O(\log n)^d$ tree-chains, we focus on getting a cycle $\Delta$ out of a single tree-chain. More formally, our setting for much of this section will be a tree-chain $G_0, G_1, \ldots, G_d$ (Definition 6.10), with a corresponding tree $T = T^{G_0,\ldots,G_d}$ as defined in Definition 6.11. For $g$, $\ell$ and a valid pair $c, w$ (Definition 7.2), we can define $c^{G_0} = c$ and $w^{G_0} = w$, and $c^{G_i}$ and $w^{G_i}$ recursively for $1 \le i \le d$ via Definitions 7.3 and 7.6. Let $\ell^{G_i}, g^{G_i}$ be the lengths and gradients on the graphs $G_i$, and $\ell^{C(G_i,F_i)}, g^{C(G_i,F_i)}$ be the lengths and gradients on the core graphs.

Note that every edge $e^G \in E(G) \setminus E(T)$ has a "lowest" level at which its image (which we call $e$) exists in the tree-chain, after which it is not in the next sparsified core graph. In this case, the edge plus its path embedding induce a cycle, which we call the sparsifier cycle associated to $e$. In the definition below we assume that the path embedding of a self-loop $e$ in $C(G_i, F_i)$ is empty.


Definition 7.11. Consider a tree-chain $G_0 = G, \ldots, G_d$ (Definition 6.10) with corresponding tree $T \stackrel{\mathrm{def}}{=} T^{G_0,\ldots,G_d}$, where for every $0 \le i \le d$ we have a core graph $C(G_i, F_i)$ and sparsified core graph $S(G_i, F_i) \subseteq C(G_i, F_i)$, with embedding $\Pi_{C(G_i,F_i) \to S(G_i,F_i)}$.

We say an edge $e^G \in E(G)$ is at level $\mathrm{level}_{e^G} = i$ if its image $e$ is in $E(C(G_i, F_i)) \setminus E(S(G_i, F_i))$. Define the sparsifier cycle $a(e)$ of such an edge $e = e_0 \in C(G_i, F_i)$ to be the cycle $a(e) = e_0 \oplus \mathrm{rev}(\Pi_{C(G_i,F_i) \to S(G_i,F_i)}(e_0)) = e_0 \oplus e_1 \oplus \cdots \oplus e_L$. We define the preimage of this sparsifier cycle in $G$ to be the fundamental chain cycle

$$a^G(e^G) = e_0^G \oplus T[v_0^G, u_1^G] \oplus e_1^G \oplus T[v_1^G, u_2^G] \oplus \cdots \oplus e_L^G \oplus T[v_L^G, u_{L+1}^G],$$

where $e_i^G = (u_i^G, v_i^G)$ is the preimage of edge $e_i$ in $G$ for each $i \in [L]$ and where we define $u_{L+1}^G = u_0^G$.

We let $\mathbf{a}(e)$ and $\mathbf{a}^G(e^G)$ be the associated flow vectors for the sparsifier cycle $a(e)$ and the fundamental chain cycle $a^G(e^G)$.

At a high level, our algorithm will maintain the total gradient of every fundamental chain cycle explicitly. Note that this implies that the gradients of at most $m^{o(1)}$ fundamental chain cycles change per iteration on average. Also, the algorithm maintains length overestimates of each fundamental chain cycle, as maintaining the true length dynamically is potentially expensive. The algorithm will return the overall best quality fundamental chain cycle.
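
The selection rule in this paragraph can be sketched as follows. This is a minimal illustration only, assuming each fundamental chain cycle is summarized by a stored total gradient and a positive length overestimate. The function name and the list-based input are hypothetical; the data structure in the text maintains the maximizer dynamically rather than rescanning all cycles.

```python
def best_quality_cycle(gradients, length_overestimates):
    """Return (index, quality) of the cycle maximizing |gradient| / length overestimate.

    Hypothetical helper: gradients[i] is the stored total gradient of cycle i,
    and length_overestimates[i] > 0 is its stored length overestimate.
    """
    best_index, best_quality = None, float("-inf")
    for i, (g, length) in enumerate(zip(gradients, length_overestimates)):
        quality = abs(g) / length  # quality is measured against the overestimate
        if quality > best_quality:
            best_index, best_quality = i, quality
    return best_index, best_quality
```

Because the denominator is an overestimate of the true length, the true quality of the returned cycle can only be larger than the reported value.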


Definition 7.12. Consider a tree-chain $G = G_0, \ldots, G_d$ with corresponding spanning tree $T \stackrel{\mathrm{def}}{=} T^{G_0,\ldots,G_d}$. For any edge $e^G \in E(G) \setminus E(T)$ at level $i$ with image $e$ in $C(G_i, F_i) \setminus S(G_i, F_i)$ we define $\widetilde{\mathrm{len}}_{e^G}$, an overestimate on the length of $e^G$'s fundamental chain cycle, as $\widetilde{\mathrm{len}}_{e^G} \stackrel{\mathrm{def}}{=} \langle \ell^{C(G_i,F_i)}, |\mathbf{a}(e)| \rangle$.

Because the lengths and gradients on all edges in all the $G_i$, and the embeddings $\Pi_{C(G_i,F_i) \to S(G_i,F_i)}$, are maintained explicitly in Lemma 7.9, we can store the length overestimates $\widetilde{\mathrm{len}}_{e^G}$ for all fundamental chain cycles, and their total gradients, with a constant overhead.

There are two more important pieces to check. First, we need to check that the gradients defined on the core graphs (Definition 6.7) indeed give the correct total gradient for each cycle, and that the values $\widetilde{\mathrm{len}}_{e^G}$ are indeed overestimates for the lengths of all the fundamental chain cycles $a^G(e^G)$. Then we will show that using the length overestimates $\widetilde{\mathrm{len}}_{e^G}$ still allows us to return a sufficiently good fundamental chain cycle.


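Lemma 7.13 (Gradient correctness). Let $e^G \notin E(T)$ be an edge with $\mathrm{level}_{e^G} = i$ and let $e$ be its image in $G_i$. Then the total gradients of the cycle $a(e)$ and of its preimage $a^G(e^G)$ are the same, i.e.

$$\langle g^{C(G_i,F_i)}, \mathbf{a}(e) \rangle = \langle g, \mathbf{a}^G(e^G) \rangle.$$

Lemma 7.14 (Length overestimates). Let $e^G \notin E(T)$ be an edge with $\mathrm{level}_{e^G} = i$ and let $e$ be its image in $G_i$. Then the value $\widetilde{\mathrm{len}}_{e^G}$ overestimates the length of the preimage cycle $a^G(e^G)$, i.e.

$$\widetilde{\mathrm{len}}_{e^G} \ge \langle \ell, |\mathbf{a}^G(e^G)| \rangle.$$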


It is useful to define the concept of lifting a cycle back from C(G;, F;) to G; in order to show 
Lemmas 7.13 and 7.14. 


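Definition 7.15 (Lifted cycle). Consider a cycle $\hat{C}$ in $C(G_i, F_i)$ with edges $\hat{e}_1 \oplus \hat{e}_2 \oplus \cdots \oplus \hat{e}_L$, such that $e_j = (u_j, v_j)$ is the preimage of $\hat{e}_j$ in $G_i$. We define the lift of $\hat{C}$ into $G_i$ as the cycle

$$e_1 \oplus F_i[v_1, u_2] \oplus e_2 \oplus \cdots \oplus e_L \oplus F_i[v_L, u_1].$$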
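
The lifting operation can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's implementation: the forest is assumed to be given as a hypothetical `parent` dictionary of a rooted forest (roots have no entry), the cycle is given by the preimage edges $(u_j, v_j)$, and the lift is returned as a closed vertex walk.

```python
def forest_path(parent, a, b):
    """Vertices on the unique forest path from a to b (same component assumed)."""
    ancestors = []
    x = a
    while x is not None:  # climb from a to its root
        ancestors.append(x)
        x = parent.get(x)
    position = {v: idx for idx, v in enumerate(ancestors)}
    tail = []
    y = b
    while y not in position:  # climb from b until we meet a's root path
        tail.append(y)
        y = parent.get(y)
        if y is None:
            raise ValueError("a and b lie in different forest components")
    return ancestors[: position[y] + 1] + tail[::-1]


def lift_cycle(preimage_edges, parent):
    """Lift a core-graph cycle: follow each edge (u_j, v_j), then the forest
    path from v_j to u_{j+1}, with indices taken cyclically."""
    walk = [preimage_edges[0][0]]
    for j, (_, v) in enumerate(preimage_edges):
        next_u = preimage_edges[(j + 1) % len(preimage_edges)][0]
        walk.extend(forest_path(parent, v, next_u))
    return walk
```

For instance, with components $\{1, 4, 5\}$ (root 5) and $\{2, 3\}$ (root 3), the two-edge core cycle with preimages $(1,2)$ and $(3,4)$ lifts to the closed walk $1, 2, 3, 4, 5, 1$.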


Now we can show Lemmas 7.13 and 7.14 by repeatedly lifting cycles until we are back in the 
top level graph G. 


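Proof of Lemma 7.13. It suffices to show that any cycle $\hat{C}$ in $C(G_i, F_i)$ and its lift $C$ in $G_i$ have the same gradient. Precisely, if we let $\hat{c}$ and $c$ denote the flow vectors of $\hat{C}$ and $C$ respectively, we wish to show $\langle g^{C(G_i,F_i)}, \hat{c} \rangle = \langle g^{G_i}, c \rangle$. To see this, recall the definition of $g^{C(G_i,F_i)}$ in Definition 6.7. For $\hat{C} = \hat{e}_1 \oplus \cdots \oplus \hat{e}_L$ with $e_j = (u_j, v_j)$, and $E(F_i) \subseteq E(T_i)$ for some tree $T_i$ (namely the tree used to initialize the forest $F_i$ of $C(G_i, F_i)$),

$$\langle g^{C(G_i,F_i)}, \hat{c} \rangle = \sum_{j=1}^{L} \Big( g^{G_i}_{e_j} + \langle g^{G_i}, p(T_i[v_j, u_j]) \rangle \Big) \overset{(i)}{=} \sum_{j=1}^{L} \Big( g^{G_i}_{e_j} + \langle g^{G_i}, p(T_i[v_j, u_{j+1}]) \rangle \Big) \overset{(ii)}{=} \sum_{j=1}^{L} \Big( g^{G_i}_{e_j} + \langle g^{G_i}, p(F_i[v_j, u_{j+1}]) \rangle \Big) = \langle g^{G_i}, c \rangle,$$

where (i) follows because $\sum_{j=1}^{L} p(T_i[v_j, u_j]) = \sum_{j=1}^{L} p(T_i[v_j, u_{j+1}])$ (for $u_{L+1} \stackrel{\mathrm{def}}{=} u_1$), as both sides route the same demand on the tree $T_i$. (ii) follows because $F_i \subseteq T_i$ and $v_j, u_{j+1}$ are in the same connected component of $F_i$. The last equality follows by the definition of $C$.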


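Proof of Lemma 7.14. Similar to the above proof of Lemma 7.13, by repeatedly lifting until we get to $G$, it suffices to show that the length of a cycle $\hat{C}$ in $C(G_i, F_i)$ is at least that of its lift $C$. Formally, if $\hat{c}$ and $c$ denote the flow vectors of $\hat{C}$ and $C$ respectively, we wish to show $\langle \ell^{C(G_i,F_i)}, |\hat{c}| \rangle \ge \langle \ell^{G_i}, |c| \rangle$. For $\hat{C} = \hat{e}_1 \oplus \cdots \oplus \hat{e}_L$ with $e_j = (u_j, v_j)$ and forest $F_i$, we have

$$\begin{aligned}
\langle \ell^{C(G_i,F_i)}, |\hat{c}| \rangle &\overset{(i)}{=} \sum_{j=1}^{L} \widetilde{\mathrm{str}}_{e_j} \ell^{G_i}_{e_j} \overset{(ii)}{\ge} \sum_{j=1}^{L} \mathrm{str}^{F_i,\ell^{G_i}}_{e_j} \ell^{G_i}_{e_j} \\
&= \sum_{j=1}^{L} \Big( \ell^{G_i}_{e_j} + \langle \ell^{G_i}, |p(F_i[u_j, \mathrm{root}^{F_i}_{u_j}])| \rangle + \langle \ell^{G_i}, |p(F_i[v_j, \mathrm{root}^{F_i}_{v_j}])| \rangle \Big) \\
&= \sum_{j=1}^{L} \Big( \ell^{G_i}_{e_j} + \langle \ell^{G_i}, |p(F_i[v_j, \mathrm{root}^{F_i}_{v_j}])| \rangle + \langle \ell^{G_i}, |p(F_i[u_{j+1}, \mathrm{root}^{F_i}_{u_{j+1}}])| \rangle \Big) \\
&\overset{(iii)}{\ge} \sum_{j=1}^{L} \Big( \ell^{G_i}_{e_j} + \langle \ell^{G_i}, |p(F_i[v_j, u_{j+1}])| \rangle \Big) = \langle \ell^{G_i}, |c| \rangle.
\end{aligned}$$

Here (i) follows from the definition of $\ell^{C(G_i,F_i)}$ in Definition 6.7, (ii) follows from Lemma 6.5 item 2, and (iii) follows from $\mathrm{root}^{F_i}_{v_j} = \mathrm{root}^{F_i}_{u_{j+1}}$ and the triangle inequality. The final equality is from the definition of $C$ as the lift of $\hat{C}$.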


We now show that it suffices to maintain the "best quality" fundamental chain cycle, i.e. $\max_{e^G \in E(G) \setminus E(T)} |\langle g, \mathbf{a}^G(e^G) \rangle| / \widetilde{\mathrm{len}}_{e^G}$. To show this, we first explain how to express a circulation $c$ as a combination of fundamental chain cycles.


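Lemma 7.16. Given a circulation $c$ in graph $G$, recursively define $c^{G_i}$ for all $i = 0, \ldots, d$ via Definitions 7.3 and 7.6. Then

$$c = \sum_{i=0}^{d} \sum_{e^G : \mathrm{level}_{e^G} = i} c^{G_i}_{e^i} \, \mathbf{a}^G(e^G),$$

where $e^i$ denotes the image of $e^G$ in $G_i$.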


Proof. Define y = ZLo Žec level =i c'al (eS). We will show that cec = Yec for any edge e? € 
G\T. 

First, define for any 1 = 0,...,d, Il; as the embedding Ile(G,,F;)—S(G,F;) and et as the image in 
Gi; for any edge e© € G. Clearly, e’ is well-defined if i < level.c. We also denote the image of et in 
the core graph C(Gi, F;) as & 

At ae level i, observe that if et! € Git we have & = etl, TI,(€’) = {e't"} and therefore 
f= = cS . Otherwise, c% Ci is added to c n f € IL(@). Let ef be any edge in G \ T at level i. 


Following from Definition 7.6, we can express eo as 


ii 
Cai = Cea + 5 5 c - TL; (f?)s- (33) 


j=0 fGrlevel pg =J 
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On the other hand, we know that e@ does not appear in any of the sparsifier cycle of fE at 
level j > i. Thus, [af (f®)]ec = 0. If level pa = j < i, [a°(f°%)|.c = —II;(f’)z where the —1 term 
comes from that the sparsifier cycle takes fi and the reverse of the path IL (fi ). This yields that 

i—1 


Gi Gj ay. @ 
Yee = Coi T 5 5 cy I)o = Cee, 


j=0 fGilevel a =j 


where (i) follows by rearranging (33). 
The lemma follows via the fact that a circulation is uniquely determined by the amount of flows 
on non-tree edges. 


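Lemma 7.17. Let $c, w$ be a valid pair. Let $T = T^{G_0,\ldots,G_d}$ for a tree-chain $G_0, \ldots, G_d$. Then

$$\max_{e^G \in E(G) \setminus E(T)} \frac{|\langle g, \mathbf{a}^G(e^G) \rangle|}{\widetilde{\mathrm{len}}_{e^G}} \ge \frac{1}{O(k)} \cdot \frac{|\langle g, c \rangle|}{\sum_{i=0}^{d} \|w^{G_i}\|_1}.$$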

Proof. Recall that for an edge € € C(G;, F;) with preimage e in Gj, its length is g = str eG 
defined in Definition 6.7. Thus by the definition of len,c in Definition 7.12, 


d d : 
> DY leFillengs =S7 SY | ste |e] + 2 EZ Jeg" 


i=0 eG :level G =i i=0 eG:level G =i e'€lle(a,,F,)+8(G;,F,) ©) 


(i) 4 z OIE 
< 5 (O(k) wo + wo") < O(k) Y wsha, 
i=0 


i=0 eG: level g=i 


where (i) follows str’ < O(k) from Lemma 6.5 item 2, and the fact that c@,w are all valid 
pairs (see Lemmas 7.4 and 7.7) so €¢%|c¢?| < |w*|, and the definition of wet in Definition 7.6. 
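Additionally, note by the triangle inequality and Lemma 7.16 that

$$|\langle g, c \rangle| \le \sum_{i=0}^{d} \sum_{e^G : \mathrm{level}_{e^G} = i} |c^{G_i}_{e^i}| \, |\langle g, \mathbf{a}^G(e^G) \rangle|.$$

Hence, we get by the fact that $\max_{i \in [n]} x_i / y_i \ge \sum_{i \in [n]} x_i / \sum_{i \in [n]} y_i$ for $x, y \in \mathbb{R}^n_{>0}$ that

$$\max_{e^G \in E(G) \setminus E(T)} \frac{|\langle g, \mathbf{a}^G(e^G) \rangle|}{\widetilde{\mathrm{len}}_{e^G}} \ge \frac{\sum_{i=0}^{d} \sum_{e^G : \mathrm{level}_{e^G} = i} |c^{G_i}_{e^i}| \, |\langle g, \mathbf{a}^G(e^G) \rangle|}{\sum_{i=0}^{d} \sum_{e^G : \mathrm{level}_{e^G} = i} |c^{G_i}_{e^i}| \, \widetilde{\mathrm{len}}_{e^G}} \ge \frac{1}{O(k)} \cdot \frac{|\langle g, c \rangle|}{\sum_{i=0}^{d} \|w^{G_i}\|_1}.$$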
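
The last step of this proof relies on the elementary fact that the best individual ratio is at least the ratio of the sums. The following small numeric check illustrates that fact only; the function name is hypothetical and the inputs are arbitrary positive numbers, not quantities from the algorithm.

```python
def best_ratio_and_aggregate(xs, ys):
    """Return (max_i xs[i]/ys[i], sum(xs)/sum(ys)) for positive sequences.

    The first value is always at least the second: if every ratio were
    below r, then sum(xs) < r * sum(ys).
    """
    best = max(x / y for x, y in zip(xs, ys))
    aggregate = sum(xs) / sum(ys)
    return best, aggregate
```

In the proof, $x$ plays the role of the weighted cycle gradients and $y$ the weighted length overestimates, so a single fundamental chain cycle already achieves the aggregate quality.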


Remark. We can adapt the statement and proof of Lemma 7.17 to remove the O(k) by being more 
careful. However, this further complicates the statements of Lemma 7.17 and its interaction with 
Section 6, and the extra O(k) does not meaningfully affect our runtimes. 


We can now complete the proof of Theorem 7.1. 
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Proof of Theorem 7.1. The first part is to dynamically maintain an explicit O(logn) branching 
tree chain with path embeddings using Lemma 7.9. With $O(1)$ overhead, the algorithm can also maintain the values of $\langle g, \mathbf{a}^G(e^G) \rangle$ and $\widetilde{\mathrm{len}}_{e^G}$ for all fundamental chain cycles of the $O(\log n)^d$ trees in the branching tree chain, because the branching tree chain maintains all edge gradients/lengths explicitly, and by Definition 7.12 and Lemma 7.13. Hence with $O(1)$ overhead it can maintain the maximizer $\arg\max_{e^G \in E(G) \setminus E(T)} |\langle g, \mathbf{a}^G(e^G) \rangle| / \widetilde{\mathrm{len}}_{e^G}$ as desired in Lemma 7.17 for each tree. To show that the best out of these works, note that by Lemma 7.9 with high probability there is a tree-chain with

$$\frac{1}{O(k)} \cdot \frac{|\langle g^{(t)}, c^{(t)} \rangle|}{\sum_{i=0}^{d} \|w^{(t),G_i}\|_1} \ge \frac{1}{O(k)\, O(\gamma_\ell)^{O(d)}} \cdot \frac{|\langle g^{(t)}, c^{(t)} \rangle|}{\sum_{i=0}^{d} \|w^{(\mathrm{prev}_i^{(t)})}\|_1} = \kappa \cdot \frac{|\langle g^{(t)}, c^{(t)} \rangle|}{\sum_{i=0}^{d} \|w^{(\mathrm{prev}_i^{(t)})}\|_1}$$

for $\kappa = 1/(O(k)\, O(\gamma_\ell)^{O(d)})$ as desired.

The total runtime is $(m + Q) m^{o(1)}$ for $Q = \sum_{t \in [\tau]} \mathrm{Enc}(U^{(t)})$ by Lemma 7.9 for the choice $\gamma_s = \gamma_\ell = \gamma_r = \exp(\log^{3/4} m \log\log m)$, $d = \log^{1/8} m$ and $k = m^{1/d}$. Thus $\kappa = \exp(-O(\log^{7/8} m \log\log m))$. Also, $Q$ approximates $\sum_{t \in [\tau]} |U^{(t)}|$ up to a polylog factor since $U^{(t)}$ contains only edge insertions/deletions.

Finally, we can rebuild levels $i, i+1, \ldots, d$ in $m^{1+o(1)}/k^i$ time because the graphs on level $i$ have $m(\gamma_s/k)^i$ edges, there are $O(\log n)^d$ such graphs, and the initialization time is almost linear.


8 Rebuilding Data Structure Levels 


The goal of this section is to use Theorem 7.1 to get Theorem 6.2 through a rebuilding game to handle the cases where $\|w^{(\mathrm{prev}_i^{(t)})}\|_1$ is much larger than $\|w^{(t)}\|_1$ for some $0 \le i \le d$. The surprising aspect is that this is doable despite the fact that the $w^{(t)}$ are all hidden. We now introduce the rebuilding game that captures these notions. Each round of the rebuilding game corresponds to our algorithm successfully returning a good enough cycle. When this does not happen, we instead have to rebuild part of our data structure. The rebuilding game is designed to let us formally reason about strategies for rebuilding the data structure when it fails to find a good cycle.


Rebuilding game parameters and definition. The rebuilding game has several parameters: integer parameters size $m > 0$ and depth $d > 0$, an update frequency $0 < \gamma_g < 1$, a rebuilding cost $C_r > 1$, a weight range $K > 1$, a recursive size reduction parameter $k \stackrel{\mathrm{def}}{=} m^{1/d} \ge 2$, and finally an integer round count $T > 0$.


Definition 8.1 (Rebuilding Game). The rebuilding game is played between a player and an adversary and proceeds in rounds $t = 1, 2, \ldots, T$. Additionally, the steps (moves) taken by the player are indexed as $s = 1, 2, \ldots$. Every step $s$ is associated with a tuple $\mathrm{prev}^{(s)} := (\mathrm{prev}_0^{(s)}, \ldots, \mathrm{prev}_d^{(s)}) \in [T]^{d+1}$. Both the player and adversary know $\mathrm{prev}^{(s)}$. At the beginning of the game, at round $t = 1$ and step $s = 1$, we initially set $\mathrm{prev}_i^{(1)} = 1$ for all levels $i \in \{0, 1, \ldots, d\}$.

At the beginning of each round $t \ge 1$,

1. The adversary first chooses a positive real weight $W^{(t)}$ satisfying $\log W^{(t)} \in (-K, K)$. This weight is hidden from the player.

2. Then, while either of the following conditions holds,

$$\sum_{i=0}^{d} W^{(\mathrm{prev}_i^{(s)})} > (d+1)\, W^{(t)} \qquad (34)$$

or

For some level $l$, at least $\gamma_g m/k^l$ rounds have passed since the last rebuild of level $l$, (35)

the adversary can (but does not have to) force the player to perform a fixing step. The player may also choose to perform a fixing step, regardless of whether the adversary forces it or not. In a fixing step, the player picks a level $i \in \{0, 1, \ldots, d\}$, and we then set $\mathrm{prev}_j^{(s+1)} \leftarrow t$ for $j \in \{i, i+1, \ldots, d\}$, and $\mathrm{prev}_j^{(s+1)} \leftarrow \mathrm{prev}_j^{(s)}$ for $j \in \{0, \ldots, i-1\}$. We call this a fix at level $i$, and we say the levels $j \ge i$ have been rebuilt. This move costs $C_r m/k^i$ time.

3. When the player is no longer performing fixing steps, the round finishes.

The goal of the player is to complete all $T$ rounds in total time cost $O\big(\frac{C_r K d}{\gamma_g}(m + T)\big)$.
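
The effect of a single fixing step on the tuple of rebuild times can be sketched as below. This is a minimal illustration under stated assumptions: the tuple is stored as a plain Python list indexed by level, and the function name and argument names (`c_r` for the rebuilding cost, `k` for the size reduction parameter) are hypothetical.

```python
def fixing_step(prev, i, t, m, k, c_r):
    """Apply a fix at level i in round t.

    Levels j >= i get their rebuild time set to t, levels j < i are
    unchanged, and the move is charged c_r * m / k**i.
    Returns (new_prev, cost) without mutating the input list.
    """
    new_prev = prev[:i] + [t] * (len(prev) - i)
    cost = c_r * m / k**i
    return new_prev, cost
```

Fixing at a deeper level is cheaper by a factor of $k$ per level, which is what makes it worthwhile for the player to try deep levels first.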


Remark 8.2. We emphasize an important point about our terminology in the rebuilding game: A fix at level $i$ causes a rebuild of all levels $j \ge i$. The adversary can force a rebuild of a level $l$ if it has participated in $\gamma_g m/k^l$ rounds since it was last rebuilt, and the latest rebuild may have been triggered by a fix at level $l$ or by a fix at some level $i < l$.


To translate the rebuilding game to the setting of Theorems 7.1 and 6.2 we can set $W^{(t)} = \|w^{(t)}\|_1$. We give an algorithm where the player completes all $T$ rounds with total time cost $O\big(\frac{C_r K d}{\gamma_g}(m + T)\big)$. For our choice of parameters, this will be almost linear. Note that it is trivial for the player to finish all $T$ rounds in time $O(C_r m T)$, as they could just always do a fix at level $i = 0$, as this sets all $\mathrm{prev}_j^{(s)}$ to $t$.


Algorithm 6: Strategy for the rebuilding game.

1 foreach i = 0, ..., d do
2     We maintain a "fixing count", fix_i, initialized to zero.
3     And we maintain a "round count", round_i, also initialized to zero.
4 foreach round t = 1, 2, ..., T of the game do
5     if there is a level l with round_l ≥ γ_g m/k^l then
6         Find the smallest level i such that round_i ≥ γ_g m/k^i.
7         Fix level i, thus rebuilding levels j ≥ i.
8         For levels j = i, i+1, ..., d set fix_j ← 0 and round_j ← 0.
          // We call this a WIN at level i.
9     while the adversary continues to force a fixing step do
10        Let i be the smallest level in 0, ..., d s.t. for all j > i, fix_j = 2K.
11        Fix level i, thus rebuilding levels j ≥ i.
12        Set fix_i ← fix_i + 1.
          // We call this a LOSS at level i.
13    For all levels i = 0, 1, ..., d, set round_i ← round_i + 1.
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The following lemma tells us that a fairly simple strategy can deterministically* ensure that the 
player always wins the rebuilding game. 


Lemma 8.3 (Strategy for Rebuilding Game). There is a deterministic strategy given by Algorithm 6 for the player to finish $T$ rounds of the rebuilding game in time $O\big(\frac{C_r K d}{\gamma_g}(m + T)\big)$.


Before we state the proof, we first introduce some important terminology for understanding 
Algorithm 6 and its analysis. 


The rebuilding game algorithm. Overall, our goal is the following: we (as the player) want to ensure that our vector of weights $W^{(\mathrm{prev}_i^{(s)})}$ at different levels $i \in \{0, 1, \ldots, d\}$ is such that we frequently succeed in completing a round without making too many fixing steps, and we need to ensure we do not spend too much time on fixing steps. To implement our strategy in Algorithm 6, we maintain two counters $\mathrm{round}_i$ and $\mathrm{fix}_i$ for each level $i$. These two counters are used to decide which level to rebuild in each step $s$ of the game.

The first counter, $\mathrm{round}_i$, is very simple. It counts the number of rounds that have occurred since level $i$ was last rebuilt. Ideally, we would like to complete as many rounds as each level can handle before we reset it. The adversary can force a fixing step if it has been at least $\gamma_g m/k^i$ rounds since level $i$ was last rebuilt. In the setting of Theorem 7.1, this corresponds to a level of the branching tree chain accumulating enough updates that it should be rebuilt. We preempt the adversary by always rebuilding a level if it has been through this many rounds, regardless of whether the adversary forces us to or not. When this occurs, the level can "pay for itself", since the cost of fixing is low when amortized across the rounds since the last rebuild. Thus we declare a "WIN" at level $i$ and rebuild levels $j \ge i$.

The second counter, $\mathrm{fix}_i$, is the more interesting one. When a fixing step occurs and we decide to fix level $i$ (thus rebuilding levels $j \ge i$) we say a "LOSS" occurred at level $i$. The $\mathrm{fix}_i$ counter tracks how many times a fixing step occurred and we had a LOSS at level $i$, counted since the last time we rebuilt level $i$ due to a WIN at some level $l \le i$. In a fixing step, we always decide to let the LOSS occur at the largest level index $i$ where $\mathrm{fix}_i < 2K$.
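
The two counters can be put together into a small simulation of the player's strategy. This is a sketch under stated assumptions, not the authors' implementation: the adversary is modeled by a hypothetical callback `adversary_forces(t, fix, rounds)` that returns whether it forces another fixing step, the hidden weights are not modeled at all, and the counters are updated exactly as the Algorithm 6 listing states (in particular, a LOSS only increments the fixing count of the chosen level).

```python
def play_rebuilding_game(T, d, m, k, gamma_g, K, c_r, adversary_forces):
    """Simulate the counter-based strategy; returns (total_cost, event_log)."""
    fix = [0] * (d + 1)     # LOSS counts per level
    rounds = [0] * (d + 1)  # rounds since the last WIN-triggered reset per level
    cost = 0.0
    log = []
    for t in range(1, T + 1):
        # WIN: preempt the adversary at the smallest level that is due.
        due = [l for l in range(d + 1) if rounds[l] >= gamma_g * m / k**l]
        if due:
            i = min(due)
            cost += c_r * m / k**i
            for j in range(i, d + 1):
                fix[j] = 0
                rounds[j] = 0
            log.append((t, "WIN", i))
        # LOSS: take the hit at the largest level whose fixing count is below 2K.
        while adversary_forces(t, fix, rounds):
            candidates = [i for i in range(d + 1) if fix[i] < 2 * K]
            if not candidates:
                raise RuntimeError("every level already has 2K LOSSes")
            i = max(candidates)
            cost += c_r * m / k**i
            fix[i] += 1
            log.append((t, "LOSS", i))
        for l in range(d + 1):
            rounds[l] += 1
    return cost, log
```

Choosing the largest level with $\mathrm{fix}_i < 2K$ is the same as choosing the smallest level $i$ such that every deeper level has already reached $2K$, provided no counter exceeds $2K$. The `RuntimeError` branch corresponds to the situation that the analysis of the game rules out.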


8.1 Analyzing the rebuilding game algorithm 


Before we start our formal proof of Lemma 8.3, we will outline the main elements of the analysis of the time cost of using Algorithm 6 to play the rebuilding game.

Ideally, we would like to say that "if a LOSS occurs at level $i$, then the weight $W^{(\mathrm{prev}_i^{(s)})}$ must be large compared to the current round $t$ weight $W^{(t)}$", because then rebuilding would reduce $W^{(\mathrm{prev}_i^{(s+1)})}$. However, this is not true, as it may be that some other level's weight $W^{(\mathrm{prev}_j^{(s)})}$ for $j < i$ is large enough to make Equation (34) hold, allowing the adversary to force a fixing step. Instead, the invariant we maintain is this: We ensure that when a fixing step occurs, if we choose to rebuild level $i$, then one of the following holds.

Case (A) Some level $l < i$ has an even larger weight than all levels $j \ge i$ and thus can be "blamed" for the fixing step, or

Case (B) no level $j > i$ has a weight large enough to force a fixing step.


*Note that when we employ the rebuilding game strategy in our overall data structure, the data structure uses
randomization. However, the randomized steps succeed with high probability union-bounded across the entire al- 
gorithm. The rebuilding game strategy corresponds to the behavior of the data structure assuming all these data 
structure randomization steps are successful. We address this formally in the proof of Theorem 6.2. 




Handling Case (B). In case (B), we significantly reduce the weight $W^{(\mathrm{prev}_i^{(s+1)})}$ compared to $W^{(\mathrm{prev}_i^{(s)})}$ at level $i$ for the next step $s + 1$. Once we have $\mathrm{fix}_j = 2K$ for all $j \ge i$, there must be a level $l < i$ with weight larger than level $i$, as repeated occurrence of (B) ensures this.

Handling Case (A). Now, the remaining key point is to make sure that in case (A), we do not waste too much time rebuilding at level $i$ before moving to rebuilding at level $i - 1$, so that we eventually start rebuilding the most problematic level $l < i$ with larger weight. Fortunately, our threshold of $2K$ fixes at level $i$ is low enough to ensure this. Finally, to help us formalize that we make progress on reducing the weight at level $i$ specifically in Case (A), we introduce a notion of "prefix maximizing" levels. At any step $s$, we say a level $i$ is "prefix maximizing" if its weight $W^{(\mathrm{prev}_i^{(s)})}$ is strictly larger than the weight $W^{(\mathrm{prev}_j^{(s)})}$ at all levels $j < i$. This leads to the following definition.


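Definition 8.4. In the Rebuilding Game, at the start of each step $s$, we define a set $\mathcal{I}^{(s)}$ of "prefix maximizing" levels, given by

$$\mathcal{I}^{(s)} \stackrel{\mathrm{def}}{=} \Big\{ i \in \{0, 1, \ldots, d\} \;\Big|\; W^{(\mathrm{prev}_i^{(s)})} > W^{(\mathrm{prev}_j^{(s)})} \text{ for all } j < i \Big\}.$$

Note that $0 \in \mathcal{I}^{(s)}$ for all $s$, and the player does not know which levels are prefix maximizing.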
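
For intuition, the set of prefix maximizing levels can be computed with a single left-to-right scan. This is an analysis aid only, since the weights are hidden from the player; the function name is hypothetical and `weights[i]` stands for the weight currently associated with level $i$.

```python
def prefix_maximizing_levels(weights):
    """Return the levels whose weight is strictly larger than every earlier level's."""
    levels = []
    running_max = float("-inf")
    for i, w in enumerate(weights):
        if w > running_max:  # strict comparison, so ties are not prefix maximizing
            levels.append(i)
        running_max = max(running_max, w)
    return levels
```

Level 0 is always returned because there are no earlier levels to compare against, matching the note in the definition.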

This next lemma shows formally that the strategy of Algorithm 6 successfully implements the kind of weight tracking we described above. Concretely, the lemma tells us that when a level $i$ is prefix maximizing, the weight of the level must be pushed down as $\mathrm{fix}_i$ increases.

Lemma 8.5 (Bound on fixing step count). In the rebuilding game, suppose the player uses the strategy of Algorithm 6. Then we always have $\mathrm{fix}_0 < 2K$.

This lemma tells us that we never have $2K$ LOSSes at level 0 before the next WIN at level 0. Since Algorithm 6 trivially ensures $\mathrm{fix}_j \le 2K$ for all levels $j > 0$, this gives us tight control over the number of fixing steps that can be forced by the adversary.


Proof of Lemma 8.5. We consider an instance of the rebuilding game and suppose the player uses
the strategy given by Algorithm 6. To prove our lemma, we first introduce a condition which must
be satisfied in each step by each level $i$ which is prefix maximizing. This condition essentially
states that the $\mathrm{fix}_i$ counter is correctly tracking an upper bound on the level weight $W^{(\mathrm{prev}_i^{(s)})}$. For
convenience of our analysis, we define the condition for levels regardless of whether they are prefix
maximizing, although we only need to show that it holds for such levels.


Definition. At the start of step $s$, if for some level $i$ we have

$$\log_2\!\left(W^{(\mathrm{prev}_i^{(s)})}\right) \le K - \mathrm{fix}_i, \qquad (\mathrm{fix}_i \text{ correctness condition}) \qquad (36)$$

we say that level $i$ satisfies the $\mathrm{fix}_i$ correctness condition at step $s$.
Given this notion, we can now state the induction hypothesis.


Inductive Hypothesis. At the start of step $s$, Condition (36) holds for each $i \in \mathcal{I}^{(s)}$.

We will prove this claim by induction on the step count $s$. First, we establish that proving this
claim is sufficient to prove the lemma. By assumption, we have $\log_2(W^{(\mathrm{prev}_i^{(s)})}) > -K$, and hence
the above claim would imply $\mathrm{fix}_i < 2K$ for all $i \in \mathcal{I}^{(s)}$, for all $s$. Since $0 \in \mathcal{I}^{(s)}$ for all $s$, we get
that $\mathrm{fix}_0 < 2K$ always.

We first establish the base case $s = 1$. Note that trivially for all $i \in \{0,1,2,\ldots,d\}$, we have
$\mathrm{fix}_i = 0$, and by assumption we have $\log_2(W^{(\mathrm{prev}_i^{(1)})}) < K = K - \mathrm{fix}_i$. This implies Condition (36)
holds for every level $i$, and hence it holds for each level $i \in \mathcal{I}^{(1)}$. This establishes the base case.

Now, we assume the induction hypothesis at the start of step $s$ and prove it for the start
of step $s+1$, i.e. we want to show Condition (36) holds for each $i \in \mathcal{I}^{(s+1)}$ at the start of step
$s+1$. We break the analysis into two main cases, depending on what happens in step $s$. The first
case (1) is when a WIN occurs. The second case (2) is when a LOSS occurs at some level $i$. We
further break the second case into two sub-cases, separately handling when (2A) $i$ is not a prefix
maximizing level at step $s$ and when (2B) $i$ is a prefix maximizing level at step $s$. Conceptually,
the key case is (2B), when we have a LOSS and $i \in \mathcal{I}^{(s)}$, which means we have to ensure that we
make progress by reducing $W^{(\mathrm{prev}_i^{(s+1)})}$ compared to the earlier value $W^{(\mathrm{prev}_i^{(s)})}$.


Case 1: a WIN occurs. In this case, a WIN must occur at some level $i \in \{0,1,2,\ldots,d\}$ (since
$\gamma_g m/k^d < 1$). Let $i$ denote the level at which the WIN occurs. In this case, for all levels $j \ge i$, we
rebuild and set $\mathrm{fix}_j = 0$, and by the definition of $K$, we thus have (for the updated value of $\mathrm{fix}_j$)
that $\log_2(W^{(\mathrm{prev}_j^{(s+1)})}) < K = K - \mathrm{fix}_j$. Thus, for each $j \ge i$, we have that Condition (36) holds,
and thus, in particular, it must hold for each $j \ge i$ with $j \in \mathcal{I}^{(s+1)}$.

For each level $l < i$, $\mathrm{fix}_l$ does not change and $W^{(\mathrm{prev}_l^{(s+1)})} = W^{(\mathrm{prev}_l^{(s)})}$. The latter implies that
for each $l < i$, $l \in \mathcal{I}^{(s+1)}$ if and only if $l \in \mathcal{I}^{(s)}$. We also conclude that Condition (36) holds at
the beginning of step $s+1$ if it held for level $l$ at the beginning of step $s$. Hence, by the induction
hypothesis at step $s$, Condition (36) holds for all $l < i$ with $l \in \mathcal{I}^{(s+1)}$.

This proves the induction hypothesis for step $s+1$ in the case where a WIN occurs.


Case 2: a LOSS occurs. We next consider the case when a LOSS occurs in step $s$, and we
let the current round be denoted by $t$. The LOSS occurs at some level $i$. In order to analyze this
case, we are going to split it further into two subcases 2A and 2B, depending on whether the LOSS
occurs at a level $i$ which is in the prefix maximizing set or not (2B and 2A respectively). However,
first we make some observations that are common to both cases 2A and 2B.

To start, we deal with levels $l < i$. As in the case of a WIN, we again have that, for each $l < i$,
$\mathrm{fix}_l$ does not change and $W^{(\mathrm{prev}_l^{(s+1)})} = W^{(\mathrm{prev}_l^{(s)})}$. The latter implies that for each $l < i$, $l \in \mathcal{I}^{(s+1)}$
if and only if $l \in \mathcal{I}^{(s)}$. Hence, by the induction hypothesis, Condition (36) holds for each $l < i$ with
$l \in \mathcal{I}^{(s+1)}$.

Next we need to deal with levels $j > i$. As a LOSS occurs in step $s$, we must have that
$\sum_{j=0}^{d} W^{(\mathrm{prev}_j^{(s)})} > 2(d+1)W^{(t)}$. This implies

$$\max_{j \in \{0,\ldots,d\}} W^{(\mathrm{prev}_j^{(s)})} \ge \frac{1}{d+1}\sum_{j=0}^{d} W^{(\mathrm{prev}_j^{(s)})} > 2W^{(t)}. \qquad (37)$$

We claim that in this case, we must have

$$\mathcal{I}^{(s)} \cap \{i+1,\ldots,d\} = \emptyset. \qquad (38)$$

Suppose for a contradiction that for some $j > i$ we have $j \in \mathcal{I}^{(s)}$. As $\mathrm{fix}_j = 2K$, we conclude
by the induction hypothesis that at the start of step $s$, we have $\log_2(W^{(\mathrm{prev}_j^{(s)})}) \le K - \mathrm{fix}_j = -K$.
But this is impossible, as $\log_2(W^{(\mathrm{prev}_j^{(s)})}) > -K$ by the game
definitions.


Subcase 2A: LOSS at level $i \notin \mathcal{I}^{(s)}$. We now further restrict to the case when at the start
of step $s$, we have $i \notin \mathcal{I}^{(s)}$. We thus have at the start of step $s$, by the condition observed in
Equation (38), that $\mathcal{I}^{(s)} \cap \{i,i+1,\ldots,d\} = \emptyset$.

This allows us to conclude that

$$\text{for all } j \in \{i,i+1,\ldots,d\} \text{ there exists } l < j \text{ with } W^{(\mathrm{prev}_l^{(s)})} \ge W^{(\mathrm{prev}_j^{(s)})}. \qquad (39)$$

By Equation (39), we conclude that there exists $l < i$ with $W^{(\mathrm{prev}_l^{(s)})} \ge \max_{j \in \{i,i+1,\ldots,d\}} W^{(\mathrm{prev}_j^{(s)})}$.
Furthermore, we can conclude that $\max_{h \in \{0,\ldots,i-1\}} W^{(\mathrm{prev}_h^{(s)})} > 2W^{(t)}$, since the maximum in
Equation (37) is achieved by an index $< i$. Consequently, when the LOSS at level $i$ occurs and we
rebuild, for all levels $j \in \{i,i+1,\ldots,d\}$, we set $\mathrm{prev}_j^{(s+1)} \leftarrow t$, and hence at the start of step $s+1$,
we have for all $j \ge i$ that

$$W^{(\mathrm{prev}_j^{(s+1)})} = W^{(t)} < \frac{1}{2}\max_{h \in \{0,\ldots,i-1\}} W^{(\mathrm{prev}_h^{(s)})} = \frac{1}{2}\max_{h \in \{0,\ldots,i-1\}} W^{(\mathrm{prev}_h^{(s+1)})},$$

and hence for all $j \ge i$ we conclude that $j \notin \mathcal{I}^{(s+1)}$. Altogether, this proves the induction hypothesis
for step $s+1$ in the case where a LOSS occurs at level $i$ and $i \notin \mathcal{I}^{(s)}$.


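Subcase 2B: LOSS at level $i \in \mathcal{I}^{(s)}$. We now consider the case when at the start of step $s$,
we have $i \in \mathcal{I}^{(s)}$. This is the most important case, where we ensure a reduction in the weight
$W^{(\mathrm{prev}_i^{(s+1)})}$ compared to $W^{(\mathrm{prev}_i^{(s)})}$. By Equation (38), we have $\mathcal{I}^{(s)} \cap \{i+1,\ldots,d\} = \emptyset$, and hence
we conclude that $W^{(\mathrm{prev}_i^{(s)})} = \max_{l \in \{0,\ldots,d\}} W^{(\mathrm{prev}_l^{(s)})} > 2W^{(t)}$. By the induction hypothesis, we have
(labelling $\mathrm{fix}_i$ explicitly by step for clarity)

$$\log_2\!\left(W^{(\mathrm{prev}_i^{(s)})}\right) \le K - \mathrm{fix}_i^{(s)},$$

hence,

$$\log_2\!\left(W^{(\mathrm{prev}_i^{(s+1)})}\right) = \log_2\!\left(W^{(t)}\right) < \log_2\!\left(W^{(\mathrm{prev}_i^{(s)})}\right) - 1 \le K - \mathrm{fix}_i^{(s)} - 1 = K - \mathrm{fix}_i^{(s+1)}. \qquad (40)$$

Thus, Condition (36) holds for $i$ at the start of step $s+1$, regardless of whether $i \in \mathcal{I}^{(s+1)}$.
Furthermore, for all $j > i$, we set $W^{(\mathrm{prev}_j^{(s+1)})} = W^{(t)} = W^{(\mathrm{prev}_i^{(s+1)})}$ and hence $j \notin \mathcal{I}^{(s+1)}$.


Again, recall that we already established Condition (36) for $l < i$ above, in the observations
common to Subcases 2A and 2B. Thus the induction hypothesis holds for step $s+1$ when $i \in \mathcal{I}^{(s)}$.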


This completes our case analysis, establishing the inductive hypothesis, and hence the lemma. 


At this point, armed with the conclusion of Lemma 8.5, we are ready to analyze the running 
time of the rebuilding game strategy given by Algorithm 6, to prove the main lemma of this section, 
Lemma 8.3. 

Before starting the proof, we will briefly outline its main elements. The costs of the rebuilding 
game occur during fixes as part of either a WIN or a LOSS in Line 6 and Line 10 respectively. 

We use a standard amortization argument to account for the cost of fixes that occur during
WINs. We can count the cost incurred during WINs separately at each level, and finally add it up
across these. In each level, the cost per round can be bounded by $C_r/\gamma_g$.

Next, we have to account for the cost of fixes carried out during LOSSes. These fixes are all
accounted for by increases in some $\mathrm{fix}_j$ counter. We then bound the total cost of fixes that later
have their $\mathrm{fix}_j$ counter reset by amortizing the cost toward the rounds that cause the reset of the
$\mathrm{fix}_j$ counter through a WIN at some level $i \le j$. Because Lemma 8.5 guarantees the $\mathrm{fix}_j$ counters
are bounded by $2K$, we can bound the additional cost amortized toward the rounds during the
WIN at level $i$ by $4KC_r/\gamma_g$ per round. Finally, the bound on the $\mathrm{fix}_j$ counters from Lemma 8.5 also
tells us that the leftover cost unaccounted for by amortization through resets is also bounded, this
time by $4KC_r m$ in total.


Proof of Lemma 8.3. To bound the running time, we use a simple amortized analysis across the 
steps of the rebuilding game. 

The round counter at level $i$, i.e., $\mathrm{round}_i$, increases by 1 in each round and hence the sum of the
increases is $T$. If a WIN occurs at level $i$, we incur a time cost through a fix of level $i$, with a cost
of $C_r m/k^i$ (Line 6). At the same time, we reduce the round counter at $i$ and all deeper levels $j > i$,
and in particular, we reduce $\mathrm{round}_i$ by $\gamma_g m/k^i$. We will amortize the cost of this fix toward the
rounds that increased $\mathrm{round}_i$ from zero to the threshold $\gamma_g m/k^i$, and thus the amortized cost from
fixes during WINs at level $i$ per round is at most $C_r/\gamma_g$. When we add this up across $T$ rounds,
the total cost from fixes during WINs at level $i$ is at most $T \cdot C_r/\gamma_g$. Thus the total cost added across our
$d+1$ levels from fixes during WINs is

$$\text{cost from fixes during WINs} \le T(d+1)C_r/\gamma_g. \qquad (41)$$


All the cost incurred during a LOSS at some level $i$ (Line 10) leads to an increase of the fixing
step counter $\mathrm{fix}_i$ by 1, and has an associated time cost of $C_r m/k^i$. We will break the cost from
LOSSes into two parts:


1. Cost accounted for by a $\mathrm{fix}_i$ counter increase where the fix counter is later reset to 0.
2. Cost accounted for by a $\mathrm{fix}_i$ counter increase where the fix counter is not reset.


We can bound the cost arising from Part 2 very easily: By Lemma 8.5, $\mathrm{fix}_i \le 2K$, and so the
cost from fixing of level $i$ without a reset of the counter following is bounded by $\mathrm{fix}_i \cdot C_r m/k^i \le
2K \cdot C_r m/k^i$. Adding this cost across all levels we get that the total cost from Part 2 is upper
bounded by

$$\text{cost from fixes during LOSSes with no fix counter reset} \le \sum_{i=0}^{d} 2K \cdot C_r m/k^i \le 4KC_r m. \qquad (42)$$

Finally, we bound the cost from Part 1. Consider the resetting of some counter $\mathrm{fix}_j$ associated
with a level $j$. Any such counter is reset during a WIN at some level $i \le j$. We will bound this
part of the cost by amortizing it toward the rounds that caused this WIN at level $i$. In particular, note that
the level $i$ experienced at least $\gamma_g m/k^i$ rounds since it was last rebuilt and at this point $\mathrm{fix}_j$ was reset
(though it may also have been reset again since). This means we can count the cost associated
with the increases in $\mathrm{fix}_j$ toward the WIN at level $i$. The total cost we need to account for in this
way toward the WIN at level $i$ is then

$$\sum_{j=i}^{d} \mathrm{fix}_j \cdot C_r m/k^j \le 2K\sum_{j=i}^{d} C_r m/k^j \le 4KC_r m/k^i.$$

Thus, the amortized cost per round associated with WINs at level $i$ through these $\mathrm{fix}_j$ resets is at
most

$$\frac{4KC_r m/k^i}{\gamma_g m/k^i} = \frac{4KC_r}{\gamma_g}.$$

As we have $T$ rounds, the total cost associated with WINs at level $i$ through $\mathrm{fix}_j$ resets is then
at most $T \cdot 4KC_r/\gamma_g$. Since a round can contribute toward a WIN at each of our $d+1$ levels, summing
the amortized cost over all levels gives

$$\text{cost from fixes during LOSSes with fix counter reset} \le \sum_{i=0}^{d} T \cdot \frac{4KC_r}{\gamma_g} = \frac{4KC_r(d+1)}{\gamma_g}\, T. \qquad (43)$$

Finally, adding together the costs accounted for in Equations (41), (42), and (43), and the cost
of executing rounds, we get a bound on the total cost of $O\!\left(\frac{C_r K d}{\gamma_g}(T + m)\right)$ as desired.
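The accounting above can be sanity-checked numerically. The following is a minimal sketch and not part of the original analysis: the function name and the sample parameters are hypothetical, and it assumes a reduction factor $k \ge 2$, so that the geometric sum in (42) is at most 2. It evaluates the three bounds (41), (42), (43) and compares the exact sum in (42) with its closed-form bound.

```python
from typing import Dict


def rebuilding_cost_bounds(T: int, m: int, d: int, k: float,
                           K: float, C_r: float, gamma_g: float) -> Dict[str, float]:
    """Evaluate the cost bounds (41)-(43) for given game parameters.

    All names are illustrative. We assume k >= 2, so sum_{i=0}^{d} 1/k^i <= 2.
    """
    # (41): fixes during WINs, summed over the d + 1 levels.
    win_cost = T * (d + 1) * C_r / gamma_g
    # (42): LOSS fixes whose counter is never reset; exact sum and its bound.
    unreset_exact = sum(2 * K * C_r * m / k ** i for i in range(d + 1))
    unreset_bound = 4 * K * C_r * m
    # (43): LOSS fixes whose counter is later reset by a WIN.
    reset_cost = 4 * K * C_r * (d + 1) * T / gamma_g
    return {
        "win": win_cost,
        "loss_unreset_exact": unreset_exact,
        "loss_unreset_bound": unreset_bound,
        "loss_reset": reset_cost,
        "total_bound": win_cost + unreset_bound + reset_cost,
    }


# Hypothetical parameter values, chosen only to exercise the formulas.
bounds = rebuilding_cost_bounds(T=1000, m=4096, d=3, k=16.0,
                                K=5.0, C_r=2.0, gamma_g=0.5)
```

Each of the three terms is at most a constant times $C_r K (d+1)(T+m)/\gamma_g$, which is where the final bound comes from.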


8.2 Dynamic Min-Ratio Cycle Using the Rebuilding Game 


In this section we combine Theorem 7.1 and Lemma 8.3 to show Theorem 6.2 which gives a data 
structure for returning min-ratio cycles in dynamic graphs with hidden stable-flow chasing updates. 


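Proof of Theorem 6.2. Let $W^{(t)} \stackrel{\mathrm{def}}{=} \|w^{(t)}\|_1$. The adversary plays the following strategy. They feed
the inputs $g^{(t)}, \ell^{(t)}, U^{(t)}$ to Theorem 7.1 and get a cycle $\Delta$. Let $\kappa^{(7.1)}$ be the approximation parameter
from Theorem 7.1. Let $\kappa \stackrel{\mathrm{def}}{=} \kappa^{(7.1)}/(2d+2)$. The adversary checks whether

$$\langle g^{(t)}, \Delta\rangle / \|\ell^{(t)} \circ \Delta\|_1 \le -\kappa\alpha. \qquad (44)$$

Because $\Delta$ is represented using $m^{o(1)}$ edges on a tree $T$, this can be performed in amortized $m^{o(1)}$
time by using a dynamic tree (Lemma 3.3). If (44) holds, the adversary allows a progress step, and
this completes the Query() operation of Theorem 6.2. Otherwise, they force the player to perform
a fixing step. This is valid because by Theorem 7.1 we must have

$$\frac{\kappa^{(7.1)}}{2d+2}\cdot\frac{\langle g^{(t)}, c^{(t)}\rangle}{\|w^{(t)}\|_1} < \frac{\langle g^{(t)}, \Delta\rangle}{\|\ell^{(t)} \circ \Delta\|_1} \le \kappa^{(7.1)}\cdot\frac{\langle g^{(t)}, c^{(t)}\rangle}{\sum_{i=0}^{d}\|w^{(\mathrm{prev}_i^{(t)})}\|_1},$$

so $\sum_{i=0}^{d}\|w^{(\mathrm{prev}_i^{(t)})}\|_1 > 2(d+1)\|w^{(t)}\|_1$. Our algorithm for Theorem 6.2 is then to implement the
player's strategy in Lemma 8.3 on top of Theorem 7.1.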

Correctness of Query() follows by definition. To bound the runtime, by Theorem 7.1 we
can take the constants $d = \log^{1/8} m$, $k = \exp(O(\log^{7/8} m))$, $\gamma_g = \exp(-O(\log^{7/8} m \log\log m))$,
$C_r = \exp(O(\log^{7/8} m \log\log m))$, $T = (m + Q)\exp(O(\log^{7/8} m \log\log m))$. Thus by Lemma 8.3 the
total runtime to execute the player's algorithm is $(m + Q)\exp(O(\log^{7/8} m \log\log m))$ as desired.
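The adversary's decision rule in this proof can be sketched as follows. This is only an illustration under stated assumptions: it reads test (44) as "ratio at most $-\kappa\alpha$" with $\kappa = \kappa^{(7.1)}/(2d+2)$, and the function and argument names are hypothetical (in the data structure the inner product and the norm are evaluated with dynamic trees rather than passed in as numbers).

```python
def adversary_allows_progress(g_dot_delta: float, length_norm: float,
                              kappa_71: float, d: int, alpha: float) -> bool:
    """Return True for a progress step, False if a fixing step is forced.

    g_dot_delta  ~ <g, Delta>
    length_norm  ~ ||ell o Delta||_1, assumed strictly positive
    kappa_71     ~ approximation parameter of the underlying cycle-finding routine
    """
    if length_norm <= 0:
        raise ValueError("length_norm must be positive")
    kappa = kappa_71 / (2 * d + 2)  # quality parameter after the loss of 2d + 2
    return g_dot_delta / length_norm <= -kappa * alpha


# Hypothetical numbers: a clearly good cycle, then a cycle with too weak a ratio.
good = adversary_allows_progress(-1.0, 1.0, kappa_71=0.5, d=1, alpha=0.1)
weak = adversary_allows_progress(-0.001, 1.0, kappa_71=0.5, d=1, alpha=0.1)
```

A forced fixing step is exactly the LOSS event of the rebuilding game, so the player's strategy from Lemma 8.3 decides which level to rebuild next.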


9 Computing the Min-Cost Flow via Min-Ratio Cycles 


In this section we give the full pseudocode for proving Theorem 1.1, modulo getting an initial
point and final point, which are explained in Lemmas 4.11 and 4.12.

We explain the implementation in procedure MinCostFlow given in Algorithm 7. The algorithm
maintains approximate lengths $\tilde\ell$ and gradients $\tilde g$, updating them when the dynamic
tree data structures $D^{(T_i)}$ report that some edge has accumulated many changes. It updates these
lengths and gradients, and passes the result to a data structure $D^{(\mathrm{HSFC})}$ which dynamically maintains
the trees $T_1,\ldots,T_s$ and a min-ratio cycle on them under hidden stable-flow chasing updates.
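
Algorithm 7: MinCostFlow($G, d, c, u^+, u^-, f^{(0)}, F^*$). Takes graph $G$, demands $d$,
costs $c$, upper/lower capacities $u^+, u^-$, initial feasible flow $f^{(0)}$ (Lemma 4.12), and guess
of the optimal flow $F^*$.

 1  global variables
 2      $\alpha \leftarrow 1/(1000 \log mU)$
 3      $\kappa \leftarrow \exp(-O(\log^{7/8} m \log\log m))$   // Approximation quality in Theorem 6.2
 4      $d \leftarrow O(\log^{1/8} n)$   // Data structure depth
 5      $D^{(\mathrm{HSFC})}$   // Hidden Stable-Flow Chasing (HSFC) data structure in Theorem 6.2
 6      $T_1, T_2, \ldots, T_s$ for $s \leftarrow O(\log n)^d$   // Trees maintained by data structure $D^{(\mathrm{HSFC})}$
 7      $\varepsilon \leftarrow \kappa\alpha/(1000s)$   // Error tolerated within each tree.
 8      $D^{(T_i)}$   // Dynamic tree data structure for trees $T_i$
 9      $f_1, \ldots, f_s \leftarrow 0 \in \mathbb{R}^E$ and $f = f^{(0)} + \sum_{i\in[s]} f_i$   // Flows on trees $T_i$
10      $\tilde f^{(t)}$   // Approximate flow at stage $t$, remembers which edges have been updated.
11      $f^{(t)}$   // Total flow at stage $t$, implicitly stored
12      $r \leftarrow \infty$   // Estimate of cost difference from optimal.
13  procedure MinCostFlow($G, d, c, u^+, u^-, f^{(0)}, F^*$)
14      while $c^\top f - F^* > (mU)^{-10}$ do
15          if $t$ is a multiple of $\lfloor \varepsilon m \rfloor$ then
16              Explicitly compute $f^{(t)} \leftarrow f^{(0)} + \sum_{i\in[s]} f_i$, $\tilde f^{(t)} \leftarrow f^{(t)}$.
17              $r \leftarrow c^\top f^{(t)} - F^*$.   // Cost difference from optimal.
18              $\tilde g^{(t)} \leftarrow g(f^{(t)})$, $\tilde\ell^{(t)} \leftarrow \ell(f^{(t)})$   // Definition 4.2
19              Rebuild $D^{(\mathrm{HSFC})}$ and update the $T_i$.   // Because $r$ may have changed by a $1+\varepsilon$ factor.
20          $U^{(t)} \leftarrow \bigcup_{i\in[s]} D^{(T_i)}$.Detect()   // Lemma 3.3
21          foreach $e \in U^{(t)}$ do
22              $\tilde f_e^{(t)} \leftarrow f_e^{(t)} = f_e^{(0)} + \sum_{i\in[s]} (f_i)_e$, $\tilde\ell_e^{(t)} \leftarrow \ell(\tilde f^{(t)})_e$   // Definition 4.2
23              $\tilde g_e^{(t)} \leftarrow 20m c_e/r + \alpha(u_e^+ - \tilde f_e^{(t)})^{-1-\alpha} - \alpha(\tilde f_e^{(t)} - u_e^-)^{-1-\alpha}$
            /* No change to $e \notin U^{(t)}$ */
24          foreach $e \notin U^{(t)}$ do $\tilde g_e^{(t)} \leftarrow \tilde g_e^{(t-1)}$, $\tilde\ell_e^{(t)} \leftarrow \tilde\ell_e^{(t-1)}$, $\tilde f_e^{(t)} \leftarrow \tilde f_e^{(t-1)}$.
25          $D^{(\mathrm{HSFC})}$.Update($U^{(t)}, \tilde g^{(t)}, \tilde\ell^{(t)}$), and update the $T_i$ for $i \in [s]$   // Theorem 6.2
26          $(i, \Delta) \leftarrow D^{(\mathrm{HSFC})}$.Query(), where $i \in [s]$ and
            $\Delta = (u_1, v_1) \oplus T_i[v_1, u_2] \oplus (u_2, v_2) \oplus \cdots \oplus (u_l, v_l) \oplus T_i[v_l, u_1]$ for edges $(u_j, v_j)$ and
            $l \le m^{o(1)}$.   // $\Delta$ represented via $m^{o(1)}$ off-tree edges and paths on $T_i$
27          $\Delta \leftarrow \eta\Delta$ for $\eta \leftarrow -\kappa^2\alpha^2/(800\langle \tilde g^{(t)}, \Delta\rangle)$   // Scale $\Delta$ so $\langle \tilde g^{(t)}, \Delta\rangle = -\kappa^2\alpha^2/800$
28          $f_i \leftarrow f_i + \Delta$ using $D^{(T_i)}$, Lemma 3.3 item 3   // Implicitly set $f^{(t+1)} \leftarrow f^{(t)} + \Delta$
29          $t \leftarrow t + 1$.
30      return $f^{(t)}$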
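The two arithmetic steps of the main loop, the per-edge approximate gradient (line 23) and the rescaling of the returned cycle (line 27), can be sketched as below. This is a minimal illustration under assumptions: the cycle is held as an explicit dictionary from edge identifiers to flow values (the algorithm itself stores it implicitly on dynamic trees), and all function and variable names are hypothetical.

```python
from typing import Dict, Hashable


def approx_gradient_entry(m: int, c_e: float, r: float, alpha: float,
                          u_plus: float, u_minus: float, f_e: float) -> float:
    """Entry of the approximate gradient for one edge, following line 23.

    Requires u_minus < f_e < u_plus so that both barrier terms are finite.
    """
    return (20 * m * c_e / r
            + alpha * (u_plus - f_e) ** (-1 - alpha)
            - alpha * (f_e - u_minus) ** (-1 - alpha))


def scale_cycle(delta: Dict[Hashable, float], g: Dict[Hashable, float],
                kappa: float, alpha: float) -> Dict[Hashable, float]:
    """Rescale a circulation so that <g, delta> = -kappa^2 alpha^2 / 800 (line 27)."""
    inner = sum(g[e] * x for e, x in delta.items())
    if inner == 0:
        raise ValueError("<g, delta> must be nonzero to rescale")
    eta = -kappa ** 2 * alpha ** 2 / (800 * inner)
    return {e: eta * x for e, x in delta.items()}


# Hypothetical three-edge cycle and gradient values.
g_example = {"a": -2.0, "b": 0.5, "c": 1.0}
delta_example = {"a": 1.0, "b": 1.0, "c": -1.0}
scaled = scale_cycle(delta_example, g_example, kappa=0.1, alpha=0.01)
```

Note that the sign of $\eta$ flips the cycle whenever $\langle \tilde g, \Delta\rangle$ is positive, so the scaled step always has a negative inner product with the gradient.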

We check that the updates to $\tilde\ell, \tilde g$ are indeed hidden stable-flow chasing in Lemma 9.2. Finally,
the $D^{(\mathrm{HSFC})}$ rebuilds itself every $\lfloor\varepsilon m\rfloor$ iterations, after which the residual cost $c^\top f - F^*$ might have
changed by a $1+\varepsilon$ factor. It terminates when $c^\top f - F^* \le (mU)^{-10}$, which happens within $m^{1+o(1)}$
iterations.

To analyze the progress of the algorithm, we will show that MinCostFlow (Algorithm 7)
satisfies the hypotheses of our main IPM result Theorem 4.3. Thus, applying Theorem 4.3 shows
that MinCostFlow (Algorithm 7) computes a min-cost flow to high accuracy in $\tilde{O}(m\kappa^{-2})$ iterations
for some $\kappa = m^{-o(1)}$.

We first note that $\tilde\ell^{(t)}$ and $\tilde g^{(t)}$ are approximately correct lengths and gradients at all times.


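Lemma 9.1 (Stability in MinCostFlow). During a call to MinCostFlow (Algorithm 7), for
$f^{(t)} = f^{(0)} + \sum_{i\in[s]} f_i$, we have $\tilde\ell^{(t)} \approx_{1.1} \ell(f^{(t)})$, for $r$ defined in line 17, $r \approx_{1+\varepsilon} c^\top f^{(t)} - F^*$, and

$$\left\|L(f^{(t)})^{-1}\left(\tilde g^{(t)} - (c^\top f^{(t)} - F^*)/r \cdot g(f^{(t)})\right)\right\|_\infty \le 10s\varepsilon = \kappa\alpha/100.$$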


Proof. To show $\tilde\ell^{(t)} \approx_{1.1} \ell(f^{(t)})$ it suffices to check that $f \leftarrow f^{(t)}$ and $\bar f \leftarrow \tilde f^{(t)}$ satisfy the
hypotheses of Lemma 4.9, precisely $\|L(f^{(t)})(\tilde f^{(t)} - f^{(t)})\|_\infty \le s\varepsilon$. Indeed, this follows directly by
the guarantees of Detect in Lemma 3.3 and the fact there are $s$ trees, because if no tree returned
$e$, then the total error is at most $s\varepsilon$.

To show the bound on the gradient, we use Lemma 4.10. Because we have argued above that
$\|L(f^{(t)})(\tilde f^{(t)} - f^{(t)})\|_\infty \le s\varepsilon$, it suffices to check that $r \approx_{1+\varepsilon} c^\top f^{(t)} - F^*$. Recall that $r$ is reset
every $\lfloor\varepsilon m\rfloor$ iterations in line 17 of Algorithm 7. For the scaled circulation $\Delta$ in line 27, Lemma 4.8,
for $\tilde g = \tilde g^{(t)}$ and $\tilde\ell = \tilde\ell^{(t)}$ in Algorithm 7, tells us

$$\frac{|c^\top \Delta|}{c^\top f^{(t)} - F^*} \le \frac{|\tilde g^{(t)\top}\Delta|}{\kappa m} \le \alpha^2\kappa/(800m) \le 1/(800m),$$

where the hypotheses of Lemma 4.8 are satisfied because of the guarantee of $D^{(\mathrm{HSFC})}$.Query()
(Theorem 6.2), and we used the bound on $|\tilde g^{(t)\top}\Delta|$ from line 27 of Algorithm 7. Hence over $\varepsilon m$
iterations, $c^\top f - F^*$ can change by at most a $(1 + 1/(800m))^{\varepsilon m} \le 1 + \varepsilon$ factor, as desired.
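The final inequality of this proof can be spelled out as a short worked step. It is added here for illustration and uses only $1 + x \le e^x$ for all real $x$ and $e^x \le 1 + 2x$ for $x \in [0,1]$, together with $\varepsilon \le 1$:

```latex
\[
\left(1 + \frac{1}{800m}\right)^{\varepsilon m}
\;\le\; \exp\!\left(\frac{\varepsilon m}{800m}\right)
\;=\; \exp\!\left(\frac{\varepsilon}{800}\right)
\;\le\; 1 + \frac{\varepsilon}{400}
\;\le\; 1 + \varepsilon .
\]
```

The same bound applies to the decrease, so between two resets of $r$ the residual cost stays within a $1+\varepsilon$ factor of the stored estimate.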


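Our next goal is to define circulations $c^{(t)}$ and upper bounds $w^{(t)}$ to make $\tilde g^{(t)}, \tilde\ell^{(t)}, U^{(t)}$
as defined in MinCostFlow (Algorithm 7) satisfy the hidden stable-flow chasing property. This
shows that the solutions $\Delta$ returned by the data structure have a good ratio.


Lemma 9.2. Let $\tilde g^{(t)}, \tilde\ell^{(t)}, U^{(t)}$ be defined as in an execution of MinCostFlow (Algorithm 7).
For $f^* \stackrel{\mathrm{def}}{=} \arg\min_{B^\top f = d} c^\top f$, let $c^{(t)} \stackrel{\mathrm{def}}{=} f^* - f^{(t)}$ and $w^{(t)} \stackrel{\mathrm{def}}{=} 50 + |\tilde\ell^{(t)} \circ c^{(t)}|$. Then $\tilde g^{(t)}, \tilde\ell^{(t)}, U^{(t)}$
satisfy the hidden stable-flow chasing property (Definition 6.1) with circulations $c^{(t)}$ and upper bounds $w^{(t)}$.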

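Proof. We check each item of Definition 6.1 carefully. For the circulation condition in item 1,
note that $B^\top c^{(t)} = B^\top f^* - B^\top f^{(t)} = d - d = 0$ because $f^*$ and $f^{(t)}$ both route the demand $d$.
For the width condition in item 2, by the definition of $w^{(t)} = 50 + |\tilde\ell^{(t)} \circ c^{(t)}|$ we trivially have
$|\tilde\ell^{(t)} \circ c^{(t)}| \le w^{(t)}$ coordinate-wise.

To check that the upper bounds $w^{(t)}$ are stable (item 3), for an edge $e$ let $t' \in [\mathrm{last}_e^{(t)}, t]$ be so
that $e$ was not updated by any $U^{(\cdot)}$ since stage $t'$. By the guarantees of Detect we know that

$$\left|\tilde\ell_e^{(t)}\left(f_e^{(t)} - f_e^{(t')}\right)\right| \le \varepsilon.$$

Hence we get that

$$\left|w_e^{(t)} - w_e^{(t')}\right| \overset{(i)}{\le} \left|\tilde\ell_e^{(t)}\left(f_e^{(t)} - f_e^{(t')}\right)\right| \le \varepsilon \overset{(ii)}{\le} \tfrac{1}{100}\, w_e^{(t')}.$$

Here, (i) follows because $\tilde\ell_e^{(t)} = \tilde\ell_e^{(\mathrm{last}_e^{(t)})} = \tilde\ell_e^{(t')}$, as $e$ was not updated in any $U^{(\cdot)}$ since stage $\mathrm{last}_e^{(t)}$;
and (ii) is because $w_e^{(t')} \ge 50$ for all $t'$. Hence $w_e^{(t)} \le 2 w_e^{(t')}$ as desired.

To check that the lengths and widths are quasipolynomially bounded for item 4, note that

$$\min\{u_e^+ - f_e^{(t)},\; f_e^{(t)} - u_e^-\} \ge \Phi(f^{(t)})^{-1/\alpha} \ge \exp(-O(\log^2 mU)),$$

by our assumption that $\Phi(f^{(t)}) \le \tilde{O}(m)$ always. Also, $|c_e^{(t)}| \le O(U)$ for all $e \in E$. This shows that
$\log \tilde\ell_e^{(t)}, \log w_e^{(t)} \le O(\log^2 mU)$ for all $e \in E$. The lower bound $w_e^{(t)} \ge 50$ is by definition. Also, the
lower bound $\tilde\ell_e^{(t)} \ge 1/U^{1+\alpha} \ge 1/U^2$ is trivial by the definition of $\tilde\ell_e^{(t)}$.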

As a result, we deduce that $D^{(\mathrm{HSFC})}$ succeeds whp. This allows us to prove that MinCostFlow
satisfies the hypotheses of Theorem 4.3, and allows us to bound the total number of iterations.


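Lemma 9.3. An execution of MinCostFlow (Algorithm 7) runs for $\tilde{O}(m\kappa^{-2}\alpha^{-2})$ iterations.

Proof. We will define $g, \ell, \Delta, \eta$ and flows $f^{(t)}$ to show that an execution of MinCostFlow (Algorithm 7)
satisfies the hypotheses of Theorem 4.3, which implies that MinCostFlow (Algorithm 7)
terminates in $\tilde{O}(m\kappa^{-2}\alpha^{-2})$ iterations.

For $f^{(t)}$ as defined in MinCostFlow (Algorithm 7), note that $\Phi(f^{(t)}) \le 200m\log(mU)$ at all
times. This is because it holds at the initial point $f^{(0)}$ (Lemma 4.12) and the potential is decreasing.

Next we define $g, \ell$, the approximate gradients and lengths. Let $g = r/(c^\top f^{(t)} - F^*) \cdot \tilde g^{(t)}$ and
$\ell = \tilde\ell^{(t)}$ for $\tilde g^{(t)}, \tilde\ell^{(t)}$ as defined in MinCostFlow (Algorithm 7). By Lemma 9.1 we know that
$\ell \approx_{1.1} \ell(f^{(t)})$ and

$$\left\|L(f^{(t)})^{-1}\left(g - g(f^{(t)})\right)\right\|_\infty = r/(c^\top f^{(t)} - F^*) \cdot \left\|L(f^{(t)})^{-1}\left(\tilde g^{(t)} - (c^\top f^{(t)} - F^*)/r \cdot g(f^{(t)})\right)\right\|_\infty \overset{(i)}{\le} (1+\varepsilon)\alpha\kappa/100 \le \alpha\kappa/50,$$

where (i) uses the bounds on $r$ and $\tilde g^{(t)}$ in Lemma 9.1. Thus for $c^{(t)} = f^* - f^{(t)}$ as in Lemma 9.2,

$$\frac{\tilde g^{(t)\top} c^{(t)}}{\|w^{(t)}\|_1} = (c^\top f^{(t)} - F^*)/r \cdot \frac{g^\top c^{(t)}}{50m + \|\tilde L^{(t)} c^{(t)}\|_1} \overset{(i)}{\le} -(1-\varepsilon)\alpha/4 \le -\alpha/8,$$

where (i) follows from the first item of Theorem 4.3 (Lemma 4.7) and $r \approx_{1+\varepsilon} c^\top f^{(t)} - F^*$ from
Lemma 9.1. Hence, by the guarantees of $D^{(\mathrm{HSFC})}$.Query() (Theorem 6.2) as called in line 26 of
MinCostFlow (Algorithm 7), we know that whp.

$$\frac{\tilde g^{(t)\top}\Delta}{\|\tilde L^{(t)}\Delta\|_1} \le -\kappa\alpha/8. \qquad (45)$$

Thus, for the scaling $\eta$ as in MinCostFlow (Algorithm 7), Theorem 4.3 (where we change $\kappa$ to
$\alpha\kappa/8$ for this setting) shows that the algorithm computes a high-accuracy flow in $\tilde{O}(m\kappa^{-2}\alpha^{-2})$
iterations.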


The final piece is to analyze the runtime of MinCostFlow (Algorithm 7) by bounding the
total size of the update batches $U^{(t)}$ as defined in line 20.


Lemma 9.4. Consider a call to MinCostFlow (Algorithm 7) and let $U^{(t)}$ be as in line 20. Then
$\sum_t |U^{(t)}| \le \tilde{O}(m\kappa^{-2}\alpha^{-2}\varepsilon^{-1}) \le m^{1+o(1)}$.

Proof. Because of the guarantee in line 27 of Algorithm 7, we know that $|\tilde g^{(t)\top}\Delta| = \kappa^2\alpha^2/800$.
Additionally by the guarantees of $D^{(\mathrm{HSFC})}$.Query() (Theorem 6.2) as called in line 26 of
MinCostFlow (Algorithm 7), we get that (see (45))

$$\|\tilde L^{(t)}\Delta\|_1 \le 8/(\kappa\alpha) \cdot |\tilde g^{(t)\top}\Delta| \le 1.$$

Hence the sum of $\|\tilde L^{(t)}\Delta\|_1$ over all iterations is $\tilde{O}(m\kappa^{-2}\alpha^{-2})$ by our bound on the number of
iterations in Lemma 9.3. Each time an update on edge $e$ occurs in $U^{(t)}$, it contributes $\Omega(\varepsilon)$ to this sum by
the guarantees of Detect in Lemma 3.3. Hence $\sum_t |U^{(t)}| \le \tilde{O}(m\kappa^{-2}\alpha^{-2}\varepsilon^{-1})$.
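For reference, the arithmetic behind this counting argument can be written out as follows. This is an illustrative restatement, reading the normalization of line 27 as $|\tilde g^{\top}\Delta| = \kappa^2\alpha^2/800$ and the guarantee (45) as $\|\tilde L\Delta\|_1 \le \frac{8}{\kappa\alpha}|\tilde g^{\top}\Delta|$, and writing $N$ for the number of iterations (a symbol introduced here only for this display):

```latex
\[
\|\tilde L \Delta\|_1
\;\le\; \frac{8}{\kappa\alpha}\cdot\frac{\kappa^2\alpha^2}{800}
\;=\; \frac{\kappa\alpha}{100}
\;\le\; 1,
\qquad\text{so}\qquad
\Omega(\varepsilon)\cdot\sum_t |U^{(t)}|
\;\le\; \sum_t \|\tilde L \Delta\|_1
\;\le\; N .
\]
```

Dividing by $\varepsilon$ and inserting the iteration bound of Lemma 9.3 for $N$ gives the stated bound on $\sum_t |U^{(t)}|$.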


Combining these pieces shows our main result Theorem 1.1 on computing min-cost flows. 


Proof of Theorem 1.1. Given a min-cost flow instance, we will first use Lemmas 4.11 and 4.12
to compute the initial flow $f^{(0)}$. We then run MinCostFlow with initial flow $f^{(0)}$, and use
Lemma 4.11 to round to an exact min-cost flow.

The only remaining piece to analyze is the runtime. The main component of the runtime is the
data structure $D^{(\mathrm{HSFC})}$ (Theorem 6.2). The inputs $\tilde g^{(t)}, \tilde\ell^{(t)}, U^{(t)}$ to $D^{(\mathrm{HSFC})}$ satisfy the hidden
stable-flow chasing property by Lemma 9.2. Hence the data structure $D^{(\mathrm{HSFC})}$ runs in total time
$\varepsilon^{-1}(m + Q)m^{o(1)} = m^{1+o(1)}$ by Theorem 6.2, because the data structure reinitializes $O(\varepsilon^{-1})$
times (in line 15), and $Q = \sum_t |U^{(t)}| \le m^{1+o(1)}$ by Lemma 9.4.

The remaining runtime components can be handled in $s \cdot m^{o(1)} = m^{o(1)}$ time per operation by
using dynamic trees (Lemma 3.3), as there are $s$ trees, and the fact that the cycle in line 26 of
MinCostFlow (Algorithm 7) is represented by $\exp(O(\log^{7/8} m \log\log m)) \le m^{o(1)}$ paths, so the
total runtime is $m^{1+o(1)}$ as desired.


10 General Convex Objectives 


The goal of this section is to extend our algorithms to the setting of optimizing single commodity 
flows for general decomposable convex objectives. 


10.1 General Setup for General Convex Objectives 


Formally, for a graph $G = (V, E)$ let $h_e : \mathbb{R} \to \mathbb{R} \cup \{+\infty\}$ be convex functions. For a flow $f$
let $h(f) \stackrel{\mathrm{def}}{=} \sum_{e \in E} h_e(f_e)$. Our goal is to minimize $h(f)$ over all flows $f$ routing a demand $d$, i.e.
$B^\top f = d$.

We cast this in the setting of empirical risk minimization (see [LSZ19]) by introducing new
variables $y \in \mathbb{R}^E$ and convex sets $\mathcal{X}_e = \{(f_e, y_e) : h_e(f_e) \le y_e\}$:

$$\min_{B^\top f = d} h(f) \;=\; \min_{\substack{B^\top f = d \\ y \in \mathbb{R}^E :\, h_e(f_e) \le y_e \text{ for all } e \in E}} 1^\top y \;=\; \min_{\substack{B^\top f = d \\ (f_e, y_e) \in \mathcal{X}_e \text{ for all } e \in E}} 1^\top y. \qquad (46)$$

Let $F^* = \min_{B^\top f = d} h(f)$. We will assume that we have access to gradients and Hessians of
$\nu$-self-concordant barriers $\psi_e : \mathcal{X}_e \to \mathbb{R}$ for the sets $\mathcal{X}_e$. Explicit self-concordant barriers are known for several
natural objectives $h_e$ (see e.g. Chapter 9.6 of [BV04], or Section 4 of [Nes98]), and it is known
that every subset $\mathcal{X} \subseteq \mathbb{R}^n$ admits an $n$-self-concordant barrier [Nes98; Nes04; Che21; LY21].

We now formally introduce the definition of self-concordance. 
Definition 10.1 ($\nu$-self-concordance [Nes04, Definition 4.2.2]). We say that a function $\psi : \mathcal{X} \to \mathbb{R}$
on an open set $\mathcal{X} \subseteq \mathbb{R}^n$ is a self-concordant barrier if $\psi$ is convex, $\psi(x) \to \infty$ as $x$ approaches the
boundary of $\mathcal{X}$, and for all $x \in \mathcal{X}$ and $v \in \mathbb{R}^n$

$$\left|\nabla^3\psi(x)[v, v, v]\right| \le 2\left(v^\top \nabla^2\psi(x)\, v\right)^{3/2}.$$

We say that $\psi$ is $\nu$-self-concordant for some $\nu > 0$ if $\psi$ is self-concordant and for all $x \in \mathcal{X}$ and
$v \in \mathbb{R}^n$ we have

$$\langle \nabla\psi(x), v\rangle^2 \le \nu \cdot v^\top \nabla^2\psi(x)\, v.$$

Analyzing the runtime of our algorithm requires assuming that various quantities are quasipoly- 
nomially bounded such as the starting flow, demands, and convex objectives, and the underlying 
self-concordant barriers. 
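As a concrete instance of Definition 10.1, the one-dimensional logarithmic barrier $\psi(x) = -\log x$ on $x > 0$ satisfies both inequalities with equality for $\nu = 1$. The sketch below checks this numerically at a few points. It is only an illustration: the function names are hypothetical, and a small relative tolerance is used because the two sides are equal up to floating-point rounding.

```python
from typing import Tuple


def log_barrier_derivatives(x: float) -> Tuple[float, float, float]:
    """First, second and third derivative of psi(x) = -log(x) for x > 0."""
    return -1.0 / x, 1.0 / x ** 2, -2.0 / x ** 3


def is_nu_self_concordant_at(x: float, nu: float = 1.0, tol: float = 1e-12) -> bool:
    """Check both inequalities of Definition 10.1 at the point x.

    In one dimension it suffices to test the direction v = 1, since both
    sides of each inequality scale identically in |v|.
    """
    d1, d2, d3 = log_barrier_derivatives(x)
    third_order_ok = abs(d3) <= 2.0 * d2 ** 1.5 * (1.0 + tol)
    barrier_parameter_ok = d1 ** 2 <= nu * d2 * (1.0 + tol)
    return third_order_ok and barrier_parameter_ok


checks = [is_nu_self_concordant_at(x) for x in (0.1, 1.0, 7.5, 100.0)]
```

With $\nu$ lowered below 1 the second inequality fails, which reflects that the barrier parameter of $-\log x$ is exactly 1.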


Assumption 10.2. We make the following assumptions for our method, for a parameter $K = \tilde{O}(1)$.

1. We have access in $\tilde{O}(1)$ time to gradients/Hessians of the self-concordant barriers $\psi_e(f_e, y_e)$.

2. All capacities, demands, and costs are polynomially bounded, i.e. $|f_e| \le m^K$ for all $e$, $\|d\|_\infty \le
m^K$, and $|h_e(x)| \le O(m^K + |x|^K)$ for all $x \in \mathbb{R}$.

3. We shift the barriers $\psi_e(f_e, y_e)$ such that $\inf_{|f_e|, |y_e| \le m^K} \psi_e(f_e, y_e) = 0$. We can shift the barriers
because that does not affect self-concordance. This implies that $\zeta_e(f_e) \ge 1$ on the whole
domain.

4. There is a feasible flow $f^{(0)}$ and variables $|y_e^{(0)}| \le m^K$ such that $m^{-K} I \preceq \nabla^2\psi_e(f_e^{(0)}, y_e^{(0)}) \preceq
m^K I$ for all $e$, and $\psi_e(f_e^{(0)}, y_e^{(0)}) \le K$.

5. The parameters $\alpha, \varepsilon, \kappa$ used throughout are all less than $1/(1000\nu)$.

6. The Hessian is quasipolynomially bounded as long as the function value is $\tilde{O}(1)$ bounded, i.e.
for all points $|f_e|, |y_e| \le m^K$ with $\psi_e(f_e, y_e) \le \tilde{O}(1)$, we have $\nabla^2\psi_e(f_e, y_e) \preceq \exp(\log^{O(1)} m) I$.


We assume everything stated above for the remainder of the section. The final assumption
in item 6 is to ensure that all lengths/gradients encountered in the algorithm are bounded by
$\exp(\log^{O(1)} m)$. This holds for all explicit $O(1)$-self-concordant barriers we have encountered, such
as those for entropy-regularized optimal transport, matrix scaling, and normed flows.

We make direct use of the following lemmas from [Nes04]. 


Lemma 10.3 ([Nes04, Theorem 4.2.4], first part). For a $\nu$-self-concordant function $f$, and any $x$
and $y$ in its domain, we have

$$\langle \nabla f(x), y - x\rangle \le \nu.$$


Lemma 10.4 ([Nes04, Theorem 4.1.7], first part). For a self-concordant function $f$, and any $x$
and $y$ in its domain, we have

$$\langle \nabla f(y) - \nabla f(x), y - x\rangle \ge \frac{\|x - y\|_{\nabla^2 f(x)}^2}{1 + \|x - y\|_{\nabla^2 f(x)}}.$$
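
Lemma 10.5 ([Nes04, Theorem 4.1.6]). For a self-concordant function $f$, and any $x$ and $y$ in its
domain such that

$$\|x - y\|_{\nabla^2 f(x)} < 1$$

we have

$$\left(1 - \|x - y\|_{\nabla^2 f(x)}\right)^2 \nabla^2 f(x) \preceq \nabla^2 f(y) \preceq \frac{1}{\left(1 - \|x - y\|_{\nabla^2 f(x)}\right)^2}\, \nabla^2 f(x).$$


Fix some $\alpha \in (0, 1/10)$, set a path parameter $t$ and minimize the following objective

$$\Psi_t(f, y) \stackrel{\mathrm{def}}{=} t \cdot 1^\top y + \sum_{e \in E} \exp(\alpha\psi_e(f_e, y_e)) = \sum_{e \in E}\left(t y_e + \exp(\alpha\psi_e(f_e, y_e))\right),$$

over $B^\top f = d$. This is analogous to our $\alpha$-power potential in Equation 9 at the start of Section 4.
Note that for a fixed flow $f$, we can eliminate the variables $y$ in the following way. We should
set $y_e = y_e(f_e)$ for $y_e(f_e) \stackrel{\mathrm{def}}{=} \arg\min_y\; t y + \exp(\alpha\psi_e(f_e, y))$. Thus we can write

$$\min_{B^\top f = d,\; y} \Psi_t(f, y) = \min_{B^\top f = d} \sum_{e \in E}\left(t y_e(f_e) + \exp(\alpha\psi_e(f_e, y_e(f_e)))\right).$$

Let $\zeta_e(f_e) \stackrel{\mathrm{def}}{=} \exp(\alpha\psi_e(f_e, y_e(f_e)))$, and $\bar\zeta_e(f_e) \stackrel{\mathrm{def}}{=} t y_e(f_e) + \zeta_e(f_e)$ and define the potential

$$Z_t(f) \stackrel{\mathrm{def}}{=} \sum_{e \in E} \bar\zeta_e(f_e). \qquad (47)$$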


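Our first main lemma (Lemma 10.8) will be that up to scaling, the function $\bar\zeta_e$ is self-concordant.
To show this, we start by studying the derivatives of the function $t y_e(f_e) + \zeta_e(f_e)$.


Definition 10.6. For a function $\psi : \mathbb{R}^n \to \mathbb{R}$, and a sequence $(i_1, i_2, \ldots, i_k) \in [n]^k$, define the
mixed partials

$$\psi_{x_{i_1} x_{i_2} \cdots x_{i_k}} \stackrel{\mathrm{def}}{=} \frac{\partial}{\partial x_{i_1}} \cdots \frac{\partial}{\partial x_{i_k}} \psi.$$


Lemma 10.7. Let $\psi : \mathcal{X} \to \mathbb{R}$ be a convex function on an open set $\mathcal{X} \subseteq \mathbb{R}^2$. For $x \in \mathbb{R}$ let
$y(x) \stackrel{\mathrm{def}}{=} \arg\min_y \psi(x, y)$. Let $\zeta(x) \stackrel{\mathrm{def}}{=} \psi(x, y(x))$. Then for $v = \begin{bmatrix} 1 \\ y'(x) \end{bmatrix}$,

$$\zeta'(x) = \langle \nabla\psi(x, y(x)), v\rangle = \psi_x(x, y(x)), \qquad (48)$$
$$\zeta''(x) = v^\top \nabla^2\psi(x, y(x))\, v, \qquad (49)$$
$$\zeta'''(x) = \nabla^3\psi(x, y(x))[v, v, v], \qquad (50)$$
$$y'(x) = -\psi_{xy}(x, y(x)) / \psi_{yy}(x, y(x)). \qquad (51)$$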

Proof. Note that $\psi_y(x, y(x)) = 0$ by the optimality of $y(x)$. By the chain rule for total derivatives

$$\zeta'(x) = \psi_x(x, y(x)) + \psi_y(x, y(x))\, y'(x) = \langle \nabla\psi(x, y(x)), v\rangle$$

which shows the first equality (48).
Taking the derivative of the first equality of (48) gives us

$$\zeta''(x) = \psi_{xx}(x, y(x)) + 2\psi_{xy}(x, y(x))\, y'(x) + \psi_{yy}(x, y(x))\, y'(x)^2 + \psi_y(x, y(x))\, y''(x)$$
$$= \psi_{xx}(x, y(x)) + 2\psi_{xy}(x, y(x))\, y'(x) + \psi_{yy}(x, y(x))\, y'(x)^2 = v^\top \nabla^2\psi(x, y(x))\, v,$$

where we have used that $\psi_y(x, y(x)) = 0$. This shows (49).
Taking the derivative of (49) gives

$$\zeta'''(x) = \psi_{xxx}(x, y(x)) + 3\psi_{xxy}(x, y(x))\, y'(x) + 3\psi_{xyy}(x, y(x))\, y'(x)^2 + \psi_{yyy}(x, y(x))\, y'(x)^3$$
$$\qquad + 2\left(\psi_{xy}(x, y(x)) + \psi_{yy}(x, y(x))\, y'(x)\right) y''(x).$$

However, note that taking the derivative of the identity $\psi_y(x, y(x)) = 0$ gives us

$$\psi_{xy}(x, y(x)) + \psi_{yy}(x, y(x))\, y'(x) = 0.$$

Plugging this into the above gives us

$$\zeta'''(x) = \psi_{xxx}(x, y(x)) + 3\psi_{xxy}(x, y(x))\, y'(x) + 3\psi_{xyy}(x, y(x))\, y'(x)^2 + \psi_{yyy}(x, y(x))\, y'(x)^3$$
$$= \nabla^3\psi(x, y(x))[v, v, v]$$

as desired.
To show (51), recall that $\psi_y(x, y(x)) = 0$. Taking a derivative of this in $x$ gives

$$\psi_{xy}(x, y(x)) + \psi_{yy}(x, y(x))\, y'(x) = 0,$$

which rearranges to (51) as desired.
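The identities (48), (49) and (51) can be verified exactly on a small example. The sketch below uses the convex quadratic $\psi(x, y) = x^2 + xy + y^2$, chosen here purely for illustration, for which $y(x) = -x/2$ and $\zeta(x) = \tfrac{3}{4}x^2$. Because $\zeta$ is quadratic, central finite differences are exact, so rational arithmetic gives an exact comparison. All function names are hypothetical.

```python
from fractions import Fraction


def psi(x: Fraction, y: Fraction) -> Fraction:
    """Illustrative convex function psi(x, y) = x^2 + x*y + y^2."""
    return x * x + x * y + y * y


def y_of_x(x: Fraction) -> Fraction:
    """Minimizer over y: psi_y = x + 2y = 0, hence y(x) = -x/2."""
    return -x / 2


def zeta(x: Fraction) -> Fraction:
    """Partially minimized function zeta(x) = psi(x, y(x))."""
    return psi(x, y_of_x(x))


def check_identities(x: Fraction, h: Fraction = Fraction(1)) -> bool:
    # Central differences are exact for the quadratic zeta.
    first = (zeta(x + h) - zeta(x - h)) / (2 * h)
    second = (zeta(x + h) - 2 * zeta(x) + zeta(x - h)) / (h * h)
    psi_x = 2 * x + y_of_x(x)                  # right-hand side of (48)
    y_prime = Fraction(-1, 2)                  # (51): -psi_xy / psi_yy = -1/2
    v = (Fraction(1), y_prime)
    hessian = ((2, 1), (1, 2))                 # constant Hessian of psi
    quad_form = sum(v[a] * hessian[a][b] * v[b]
                    for a in range(2) for b in range(2))  # right-hand side of (49)
    slope_matches = (y_of_x(x + h) - y_of_x(x)) / h == y_prime
    return first == psi_x and second == quad_form and slope_matches


all_hold = check_identities(Fraction(3, 7)) and check_identities(Fraction(-5))
```

Identity (50) holds trivially in this example, since both $\zeta'''$ and $\nabla^3\psi$ vanish for a quadratic.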


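Now we show that the $\bar\zeta_e$ functions are self-concordant. Note that we do not claim that $\bar\zeta_e$ is
$\nu$-self-concordant, just self-concordant.


Lemma 10.8. For all $e \in E$, $\alpha^{-1}\bar\zeta_e/4$ is a self-concordant function.

Proof. We calculate

$$\alpha^{-1}\bar\zeta_e'''(f_e) = \alpha^{-1}\nabla^3\left(\exp(\alpha\psi_e(f_e, y_e(f_e)))\right)[v, v, v]$$
$$= \Big(\nabla^3\psi_e(f_e, y_e(f_e))[v, v, v] + 3\alpha\nabla^2\psi_e(f_e, y_e(f_e))[v, v]\,\langle\nabla\psi_e(f_e, y_e(f_e)), v\rangle + \alpha^2\langle\nabla\psi_e(f_e, y_e(f_e)), v\rangle^3\Big)\zeta_e(f_e),$$

for $v = \begin{bmatrix} 1 \\ y_e'(f_e) \end{bmatrix}$. By $\nu$-self-concordance of $\psi_e$, we can bound the previous expression by

$$\alpha^{-1}\bar\zeta_e'''(f_e) \le 4\left(\nabla^2\psi_e(f_e, y_e(f_e))[v, v]\right)^{3/2}\zeta_e(f_e) \le 4\left(\alpha^{-1}\bar\zeta_e''(f_e)\right)^{3/2},$$

where the last inequality follows by the formula for $\bar\zeta_e''(f_e)$. Scaling by a factor of 4 completes the
proof.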


Our algorithm for solving (46) will fix a value of $t$ and reduce the value of the potential (47)
until $1^\top y \le F^* + 50\nu m/t$. Once this holds, we will double the value of $t$ and start a new phase.
Each phase will require approximately $m^{1+o(1)}\alpha^{-2}$ iterations.

We now formally define the gradients and lengths. The gradient is

$$g(f)_e \stackrel{\mathrm{def}}{=} [\nabla Z_t(f)]_e = \bar\zeta_e'(f_e), \qquad (52)$$

and the lengths we define as

$$\ell(f)_e \stackrel{\mathrm{def}}{=} \sqrt{\alpha^{-1}\bar\zeta_e''(f_e)}$$
$$= \sqrt{\left(v^\top\nabla^2\psi_e(f_e, y_e(f_e))\, v + \alpha\langle\nabla\psi_e(f_e, y_e(f_e)), v\rangle^2\right)\zeta_e(f_e)} \quad\text{for } v = \begin{bmatrix} 1 \\ y_e'(f_e) \end{bmatrix}. \qquad (53)$$

Here, the equality starting line 2 follows from Lemma 10.7, (49) applied to the function $\bar\zeta_e(f_e)$.
Define $f_t^* \stackrel{\mathrm{def}}{=} \arg\min_{B^\top f = d} Z_t(f)$. We now bound the optimality gap of $f_t^*$. This will ultimately
show that if $Z_t(f) - tF^*$ is much larger than $4m$, then we can reduce the potential by $m^{-o(1)}$ in a
single step.

Lemma 10.9 (Optimality gap). For sufficiently small $\alpha = \Omega(1/\log\max(t, 2))$, we have that
$Z_t(f_t^*) - tF^* \le 4m$.

Proof. Recall that $f^{(0)}, y^{(0)}$ are the initially feasible points. Let $f^*$ be the optimal flow and $y_e^* =
h_e(f_e^*)$. We will upper bound $Z_t(f)$ for a flow $f = \beta f^{(0)} + (1 - \beta)f^*$ and $y = \beta y^{(0)} + (1 - \beta)y^*$
for a parameter $\beta \in [0, 1]$ chosen later. Define $Q = h(f^{(0)}) - F^*$, the optimality gap of the original
flow. By our assumptions, we know that $\log Q = \tilde{O}(1)$. We set $\beta = \min(1, m/(tQ))$. For $s \in [0, 1]$
let $f^{(s)} \stackrel{\mathrm{def}}{=} f^{(0)} + s(f^* - f^{(0)})$ and $y^{(s)} \stackrel{\mathrm{def}}{=} y^{(0)} + s(y^* - y^{(0)})$.
Define the function Ye(s) = Yel fo, y), which is v-self-concordant as it is the restriction of 
Sl 


We onto a line. By self-concordance, Y (8) > pe (s)/(2 Y (s)), so integrating both sides gives 


Fal — B) — Be(0) > VEC — A) - V0). 
By v-self-concordance and [Nes04, Theorem 4.2.4] (Lemma 10.3), we know 6¢,(1 — 8) < v, and 


1 


pe(0) > —y vý. (0). Rearranging this gives us 


TLA — B) < 2V v (0) + v/B. (54) 


We use this to bound w.(fe, ye) = Yell — 8). We can assume that w.,(0) > 0 because w.(s) is an 
increasing function, so we might as well start our integration at the minimizer on the line. 
Now, by rearranging the v-self-concordance condition we get 


SN" 
Oreo eee 
p(s) +1 
Integrating both sides gives us 
. z p(1-6)+1 


Ball 6) -70 < voe ( ) +1 <vio( vibq(1 = 8) +1) +1. 


$(0) +1 


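Recall by our assumption that $\bar\psi_e(0) = \psi_e(f_e^{(0)}, y_e^{(0)}) \le K = \tilde{O}(1)$. Additionally by (54) the RHS of
the above expression is also bounded by $\tilde{O}(1) + \log(\nu/\beta) \le \tilde{O}(1) + \max(0, O(\log t))$ because $\log Q =
\tilde{O}(1)$. Thus, we get that $\bar\psi_e(1 - \beta) = \tilde{O}(1) + \max(0, O(\log t))$, and in turn for $\alpha = \Omega(1/\log\max(t, 2))$,

$$Z_t(f) - tF^* \le t\left(1^\top y - F^*\right) + \sum_{e \in E}\exp(\alpha\psi_e(f_e, y_e)) \le t\beta Q + 2m \le 4m.$$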
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Using this, we will bound the quality of the solution f/ — f, i.e. how negative its gradient is 
compared to its total length. 


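Lemma 10.10. Let $\alpha$ be set as in Lemma 10.9. If $1^\top y(f) - F^* \ge 10m/t$ and
$\|L(f)^{-1}(\tilde g - g(f))\|_\infty \le \epsilon$ for $\epsilon < \alpha/100$ then
$$\tilde g^\top (f_t^* - f) \le -\alpha/4 \cdot \|L(f)(f_t^* - f)\|_1 - m/4.$$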


Proof. We first handle the case where a||L(f)(f/ — f)llı < 10m. In this case, by the convexity of 
Zi(f), we get 


ICF) (Fi — F) < ZAFE) — ZF) = (ACF) — F") — (Ze f) — F*) 


which upon applying Lemma 10.9 to the first term, and the assumption of 1' y(f) — F* > 10m/t 
to the second gives 
< 4m —t-10m/t < —6m. 


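Thus, we get
$$\tilde g^\top (f_t^* - f) \le g(f)^\top (f_t^* - f) + \|L(f)^{-1}(\tilde g - g(f))\|_\infty \|L(f)(f_t^* - f)\|_1$$
$$\le -6m + \epsilon \cdot 10m/\alpha \le -5m \le -\alpha/4 \cdot \|L(f)(f_t^* - f)\|_1 - m/4.$$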


Now, we handle the case where a||L(f)(f; — f)||1 > 10m. Consider the function 


Cel fe) = atel fe)/4, 


which by Lemma 10.8 is self-concordant. Invoking [Nes04, Theorem 4.1.7] (Lemma 10.4) on this 
function, we get for all e, 


Co fe) "| [féle a 
l+ Cel fe) [File 


>  G(Fe)"Ilfle— fel = 1. 


V 


CFE) — FAS e — fe) = 


V 


Rearranging the above equation gives us 


UFE] = Fe) < 40 (QURI — o = VEGAS- fel + 1) 
= 4d ([Fi]e)(Uffle — fe) = 0/2- USAF — fel + 40 


By the optimality of fž we know that g(fž) = Bz for some z, so the first term is 0 because the 
difference fř — f is a circulation. Hence 


oF) "(Ft — fe) < —a/2- ILA) (FE — F)ll + 4am. 
which upon incorporating errors from the approximate gradient gives 


J (ft — F) < —a/2- LAF — fylla + 4am + EATE = PILAE — 9(F)) loo 
< (-a/2 + €) -L(A — fll + 4am < -a/4- [L(A — Pll — m/4 


as long as a||L(f)(f — f)||1 > 10m, for the choice of a. 


We move towards analyzing how a step A decreases the potential Z;. We start by showing that 
the gradients and lengths are stable in a Hessian ball. 
Lemma 10.11 (Stability bounds). For a flow $f$ and vector $\bar f \in \mathbb{R}^E$ satisfying $\|L(f)(f - \bar f)\|_\infty \le \epsilon$
for e < 1/1000, then &(f) 1452 €(F), IELAI) — 9(F))lloo < £. 


Proof. The stability of lengths follows from self-concordance of a~!¢,/4 shown in Lemma 10.8, plus 
the Hessian stability of such functions shown in [Nes98, Theorem 4.1.6] (Lemma 10.5). To analyze 
gradient stability, let d(s) = ¢)(f)). Now 


ICs) | =|Fe — FAC (F) = al Fe — Fel(F)? < 2al(fe)"|Fe — fel < 2aeke(f). 


Hence |£.(f)~1(¢(1) — @(0))| < 2ae < £ as desired. 


We can use Lemma 10.11 to show that a good quality circulation $\Delta$ decreases the potential $Z_t$.


Lemma 10.12. Let L œ L(f) and |LIA g — g(f))\loo < € fore < K/100. If circulation A 
satisfies J| A/|\|LA||1 < —k, then for n > 0 chosen so that ng! A = —K?/50 satisfies 


Zi(f +n&) < Zi(f) — «7/100. 
Proof. Let A = nA. Define f) = f + sA, and ¢(s) = Z,(f©). By Taylor’s theorem, we know 


Z(f +n) — Z:(f) = 90) — o(0) < (0) + max $"(s)?/2 


s€[0,1] 
S ng(f) A+ V2 (HA 
<ng' A+ (9(f)-g)' A+ allL AAI 


< =k? /50 + L(A ICF) -Dll LAA + LP) AM; 
< —K7/50 + eK /50 + («/50)? < —K? /100, 


A 


where (7) follows from length stability in Lemma 10.11. 


We can now state and show our main result on optimizing flows under general convex objectives. 


Theorem 10.13 (General convex flows). Let $G$ be a graph with $m$ edges, and let $d$ be a demand.
Given convex functions $h_e : \mathbb{R} \to \mathbb{R} \cup \{+\infty\}$ and $\nu$-self-concordant barriers $\psi_e(f, y)$ on the domain
$\{(f,y) : y \ge h_e(f)\}$ satisfying the guarantees of Assumption 10.2, there is an algorithm that runs
in $m^{1+o(1)}$ time and outputs a flow $f$ with $B^\top f = d$ and for any fixed constant $C > 0$,
$$h(f) \le \min_{B^\top f^* = d} h(f^*) + \exp(-\log^C m).$$


Proof. Initialize t = mo), and set a = Q(1) as in Lemma 10.9. For this fixed value of t run 
the analogue of Algorithm 7, and we repeat the same analysis as in Section 9. We will store the 


approximate values of f,y(f). Every Q(m) iterations, we recompute f,y(f) exactly and check 
whether 1! y(f) — F* < 20m/t. If so, we double t and proceed to the next phase. We stop when 
t =m), so there are at most O(1) phases. 

By Lemmas 10.10 and 10.12, the value of $Z_t$ decreases by $\kappa^2\alpha^2 = m^{-o(1)}$ per iteration.
When $t$ doubles, because we know that $1^\top y(f) - F^* \le 20m/t$ by the stopping condition, $Z_{2t}(f) \le
20m + Z_t(f)$, i.e. the potential increases by at most $20m$. Hence over all $O(1)$ phases, the total
potential increase is $O(m)$. So the algorithm runs in at most $m^{1+o(1)}$ iterations. The number of
gradient/length changes is bounded by $m^{1+o(1)}$ if they are updated lazily by Lemma 10.11.
Because $Z_t(f) \le O(m)$ always, by the choice of $\alpha$ we know that $\psi_e(f_e, y_e) \le O(1)$ at all
times. Thus, by item 6 of Assumption 10.2, all lengths are quasipolynomially bounded during the
algorithm. Additionally, Lemma 10.10 and an identical analysis to Lemma 9.2 for $c = f_t^* - f$ and
$w \stackrel{\mathrm{def}}{=} 50 + |\ell(f) \circ c|$ shows that the updates to $g, \ell$ satisfy the hidden stable-flow chasing property
(Definition 6.1). Hence our min-ratio cycle data structure Theorem 6.2 succeeds whp. in total time
$m^{1+o(1)}$ as desired.


10.2 Applications: p-Norms, Entropy-Regularized Optimal Transport, and Matrix Scaling


Using our main result Theorem 10.13 we can give algorithms for the problems of normed flows, iso- 
tonic regression, entropy-regularized optimal transport, and matrix scaling. We start by discussing 
p-norm flows. In this case, we allow the convex functions on our edges to be the sum of arbitrarily 
weighted power functions where the power is at most O(1). 


Theorem 10.14. Consider a graph $G = (V,E)$ and demand $d$ whose entries are bounded by
$\exp(\log^{O(1)} m)$, and convex functions $h_e$ which are the sum of $O(1)$ p-norm terms, i.e. $h_e(x) =
\sum_{i=1}^{c_e} w_i |x|^{p_i}$ for $c_e \le O(1)$ and $p_i \le O(1)$, and $w_i \in [0, \exp(\log^{O(1)} m)]$ for all $i \in [c_e]$. Let
$h(f) \stackrel{\mathrm{def}}{=} \sum_{e \in E} h_e(f_e)$. Then in $m^{1+o(1)}$ time we can compute a flow $f$ satisfying $B^\top f = d$ and for
any constant $C > 0$
$$h(f) \le \min_{B^\top f^* = d} h(f^*) + \exp(-\log^C m).$$


Proof. By splitting up an edge $e$ into $c_e$ edges in a path we can assume that each $h_e(x) = w|x|^p$
for some $p \le O(1)$. It is known that the function $\psi(x, y) = -2\log y - \log(y^{2/p} - x^2)$ is 4-self-concordant
for the region $\{(x, y) : y \ge |x|^p\}$ [Nem04, Example 9.2.1]. For this barrier, all items in
Assumption 10.2 hold by observation, except those involving $\nabla^2\psi(x, y)$ which we now calculate.
$$\nabla^2\psi(x, y) = \begin{pmatrix} \frac{2}{y^{2/p} - x^2} + \frac{4x^2}{(y^{2/p} - x^2)^2} & -\frac{4xy^{2/p-1}}{p(y^{2/p} - x^2)^2} \\ -\frac{4xy^{2/p-1}}{p(y^{2/p} - x^2)^2} & \frac{2}{y^2} + \frac{4y^{4/p-2}}{p^2(y^{2/p} - x^2)^2} + \frac{(2p-4)y^{2/p-2}}{p^2(y^{2/p} - x^2)} \end{pmatrix}$$
Clearly, if $-\log y, -\log(y^{2/p} - x^2) \le O(1)$, and $|y|, |x| \le m^{O(1)}$, then all terms of $\nabla^2\psi(x, y)$ are
bounded by $\exp(\log^{O(1)} m)$ as desired, which verifies item 6 of Assumption 10.2. Thus all assumptions
are satisfied, so the result follows by Theorem 10.13.
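As a sanity check on the Hessian of the barrier $\psi(x, y) = -2\log y - \log(y^{2/p} - x^2)$, the following Python sketch compares entries obtained by differentiating $\psi$ by hand against central finite differences. It is an illustration only and not part of the analysis; the function names and the sample point are our own choices.

```python
import math

def psi(x, y, p):
    """Barrier psi(x, y) = -2 log y - log(y^(2/p) - x^2) on {y > |x|^p}."""
    return -2.0 * math.log(y) - math.log(y ** (2.0 / p) - x * x)

def hessian_closed_form(x, y, p):
    """Entries (psi_xx, psi_xy, psi_yy) obtained by differentiating psi by hand."""
    u = y ** (2.0 / p) - x * x
    pxx = 2.0 / u + 4.0 * x * x / u ** 2
    pxy = -4.0 * x * y ** (2.0 / p - 1.0) / (p * u ** 2)
    pyy = (2.0 / y ** 2
           + 4.0 * y ** (4.0 / p - 2.0) / (p ** 2 * u ** 2)
           + (2.0 * p - 4.0) * y ** (2.0 / p - 2.0) / (p ** 2 * u))
    return pxx, pxy, pyy

def hessian_finite_diff(x, y, p, h=1e-4):
    """Central second differences of psi, used only as an independent check."""
    f = lambda a, b: psi(a, b, p)
    pxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    pyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    pxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return pxx, pxy, pyy
```

At an interior point such as $(x, y) = (0.5, 2)$ the two computations should agree up to discretization error, and the resulting $2 \times 2$ matrix should be positive definite, as expected for a convex barrier.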


The same barriers allow us to solve the problem of $\ell_p$ isotonic regression [KRS15]. Given a
directed acyclic graph $G = (V, E)$ and a vector $y \in \mathbb{R}^V$, the $\ell_p$ isotonic regression problem asks to
return a vector $x$ that satisfies $x_u \le x_v$ for all directed arcs $(u,v) \in E$, and minimizes $\|W(x - y)\|_p$
for a weight vector $W \ge 0$. Linear algebraically, this is $\min_{x \in \mathbb{R}^V, Bx \ge 0} \|W(x - y)\|_p$. We show that
the dual of this problem is a flow problem. Let $q$ be the dual norm of $p$. Then
$$\min_{x \in \mathbb{R}^V, Bx \ge 0} \|W(x - y)\|_p = \min_{x \in \mathbb{R}^V} \max_{f \ge 0, \|z\|_q \le 1} z^\top W(x - y) - f^\top Bx$$
$$= \max_{f \ge 0, \|z\|_q \le 1} \min_{x \in \mathbb{R}^V} z^\top W(x - y) - f^\top Bx$$
$$= \max_{f \ge 0, \|W^{-1}B^\top f\|_q \le 1} -f^\top (By).$$
Let $c = By$. By rescaling, the objective becomes computing
$$\min_{f \ge 0} c^\top f + \|W^{-1}B^\top f\|_q^q.$$




Given a high-accuracy solution to this objective, we can extract the desired original potentials $x$
by taking a gradient of the objective. To turn this objective into the q-norm of a flow, add a new
vertex $v^*$ to the graph $G$, and an undirected edge between $(v, v^*)$ for all $v \in V$. Assign this edge
the convex function $w|x|^q$, and for every other original edge $e \in E$, assign it the convex function
$c_e f_e$, and restrict $f \ge 0$ (e.g. using a logarithmic barrier). Finally, force $f$ to have 0 demand
on the graph $G$ with the extra vertex $v^*$. This is now a clearly equivalent flow problem. As we
have already described the self-concordant barriers for linear objectives, $f_e \ge 0$, and q-norms in
the proof of Theorem 10.14, we get:


Theorem 10.15. Given a directed acyclic graph $G$, vector $y \in \mathbb{R}^V$, and $p \in [1, \infty]$, we can compute
in $m^{1+o(1)}$ time vertex potentials $x$ with $Bx \ge 0$ and for any constant $C > 0$
$$\|x - y\|_p \le \min_{Bx^* \ge 0} \|x^* - y\|_p + \exp(-\log^C m).$$
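To make the isotonic regression problem concrete, here is a small Python sketch for the special case where $G$ is a directed path and $p = 2$. It uses the classical pool-adjacent-violators method, not the flow-based algorithm of this section, and the function names are illustrative. The weights `w` enter the objective as $\sum_i w_i (x_i - y_i)^2$ and are assumed positive.

```python
def pav_l2_on_path(y, w=None):
    """Minimize sum_i w_i * (x_i - y_i)^2 subject to x_0 <= x_1 <= ... <= x_{n-1}."""
    w = [1.0] * len(y) if w is None else list(w)
    blocks = []  # each block is [weighted sum, total weight, number of entries]
    for yi, wi in zip(y, w):
        blocks.append([wi * yi, wi, 1])
        # Merge while the means of the last two blocks violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, t, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += t
            blocks[-1][2] += c
    x = []
    for s, t, c in blocks:
        x.extend([s / t] * c)  # every entry of a block takes the block mean
    return x

def is_isotonic(x, arcs):
    """Check the constraint x_u <= x_v for every directed arc (u, v)."""
    return all(x[u] <= x[v] for u, v in arcs)
```

For example, on the input `[1, 3, 2, 4, 3, 5]` the two violating pairs are each pooled to their mean, which gives a non-decreasing output.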


Next we discuss the pair of problems of entropy-regularized optimal transport/min-cost flow
and matrix scaling, which are duals. The former problem is
$$\min_{B^\top f = d} \sum_{e \in E} c_e f_e + f_e \log f_e.$$
The matrix scaling problem asks, given a matrix $A \in \mathbb{R}^{n \times n}_{\ge 0}$ with non-negative entries, to compute
positive diagonal matrices $X, Y$ so that $XAY$ is doubly stochastic, i.e. all row and column sums
are exactly 1. By [CMTV17, Theorem 4.6], it suffices to minimize the objective
$$\sum_{(i,j) : A_{ij} \ne 0} A_{ij} \exp(x_i - y_j) - \sum_{i=1}^n x_i + \sum_{i=1}^n y_i$$
to high accuracy. Consider turning the pair $(i, j)$ with $A_{ij} \ne 0$ into an edge $e = (i, j+n)$ in a graph
$G$ with $2n$ vertices with weight $w_e = A_{ij}$. Let $z = \begin{bmatrix} x \\ y \end{bmatrix}$ be the concatenation of the $x, y$ vectors.
Then the above problem becomes $\sum_{e = (i,j) \in E(G)} w_e \exp(z_i - z_j) - \sum_{i=1}^{n} z_i + \sum_{i=n+1}^{2n} z_i$. The optimality conditions
for this objective are $B^\top W \exp(Bz) = d$, where $d$ is $+1$ on the vertices $\{1, \dots, n\}$ and $-1$ on
$\{n+1, \dots, 2n\}$. Rearranging this gives us $B^\top f = d$ for $f = W\exp(Bz)$, or $Bz = \log(W^{-1}f)$.
This is the exact optimality condition for the flow problem
$$\min_{B^\top f = d} \sum_{e \in E} (-1 - \log w_e) f_e + f_e \log(f_e),$$
which is exactly entropy-regularized optimal transport for $c_e = (-1 - \log w_e)$. If the entries of
$A$ are polynomially lower and upper bounded, then given an (almost) optimal flow $f$, we know
by KKT conditions that $f = W \exp(Bz)$ for some dual variable $z$ which we can then efficiently
recover. So it suffices to give high-accuracy algorithms for the entropy regularized OT problem.
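The correspondence between matrix scaling and flows can be checked numerically on a tiny instance. The Python sketch below finds scaling potentials with the classical Sinkhorn iteration (a stand-in chosen for brevity, not the algorithm of this paper), forms the flow $f_e = w_e \exp(z_i - z_j)$ on the bipartite edges $e = (i, j+n)$, and verifies that its net outflow is $+1$ on row vertices and $-1$ on column vertices. All function names are illustrative, and the input is assumed to be an entrywise positive square matrix.

```python
import math

def sinkhorn_potentials(A, iters=500):
    """Alternately choose x, then y, so that rows, then columns, of X A Y sum to 1.

    Here X = diag(exp(x)) and Y = diag(exp(-y)).
    """
    n = len(A)
    x, y = [0.0] * n, [0.0] * n
    for _ in range(iters):
        for i in range(n):
            x[i] = -math.log(sum(A[i][j] * math.exp(-y[j]) for j in range(n)))
        for j in range(n):
            y[j] = math.log(sum(A[i][j] * math.exp(x[i]) for i in range(n)))
    return x, y

def edge_flow(A, x, y):
    """Flow f_e = w_e * exp(z_i - z_j) on edge e = (i, j + n), with w_e = A[i][j]."""
    n = len(A)
    return {(i, j + n): A[i][j] * math.exp(x[i] - y[j])
            for i in range(n) for j in range(n) if A[i][j] != 0}

def net_outflow(f, n):
    """Entry v of B^T f: flow leaving v minus flow entering v."""
    d = [0.0] * (2 * n)
    for (u, v), value in f.items():
        d[u] += value
        d[v] -= value
    return d
```

On `A = [[1, 2], [3, 4]]` the resulting demand vector should be close to `[1, 1, -1, -1]`, which is the condition $B^\top f = d$ from the text.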


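Theorem 10.16. Given a graph $G = (V, E)$, demands $d$, costs $c \in \mathbb{R}^E$, and weights $w_e \in \mathbb{R}_{\ge 0}$, all
bounded by $\exp(\log^{O(1)} m)$, let $h(f) \stackrel{\mathrm{def}}{=} \sum_{e \in E} c_e f_e + w_e f_e \log f_e$. Then in $m^{1+o(1)}$ time we can find
a flow $f$ with $B^\top f = d$ and for any constant $C > 0$
$$h(f) \le \min_{B^\top f^* = d} h(f^*) + \exp(-\log^C m).$$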
Proof. By splitting the edge $e$ into two edges, we can handle the terms $c_e f_e$ and $w_e f_e \log f_e$ separately.
For the $c_e f_e$ term, we can handle it using the self-concordant barrier $\psi(x, y) \stackrel{\mathrm{def}}{=} -\log(y - c_e x)$.
For the term $w_e f_e \log f_e$, we use the 2-self-concordant barrier $\psi(x, y) \stackrel{\mathrm{def}}{=} -\log x - \log(y - x \log x)$
[Nem04, Example 9.2.4]. As in the proof of Theorem 10.14, all items of Assumption 10.2 hold
directly, except that we must compute $\nabla^2\psi(x, y)$.
$$\nabla^2\psi(x, y) = \begin{pmatrix} \frac{1}{x^2} + \frac{(1 + \log x)^2}{(y - x\log x)^2} + \frac{1}{x(y - x\log x)} & -\frac{1 + \log x}{(y - x\log x)^2} \\ -\frac{1 + \log x}{(y - x\log x)^2} & \frac{1}{(y - x\log x)^2} \end{pmatrix}$$
Indeed, if $-\log x, -\log(y - x\log x) \le O(1)$, and $|x|, |y| \le \exp(\log^{O(1)} m)$, then all terms in the
Hessian are quasipolynomially bounded. This verifies item 6 of Assumption 10.2, so the result
follows by Theorem 10.13.


Corollary 10.17 (Matrix scaling). Given a matrix $A \in \mathbb{R}^{n \times n}_{\ge 0}$ whose nonzero entries are in
$[n^{-O(1)}, n^{O(1)}]$, we can find in time $\mathrm{nnz}(A)^{1+o(1)}$ positive diagonal matrices $X, Y$ such that all
row and column sums of $XAY$ are within $\exp(-\log^C n)$ of 1 for any constant $C > 0$.
A Previous Works


We give a brief overview of the many approaches toward the max-flow and min-cost flow problems.
A more detailed description of many of these approaches, and more, can be found in the CACM article
by Goldberg and Tarjan [GT14]. As there is a vast literature on flow algorithms, this list is by no
means complete: we plan to update this section in subsequent works, and would greatly appreciate
any pointers.


A.1 Maximum Flow


The max-flow problem, and its dual, the min-cut problem, were first studied by Dantzig [Dan51],
who gave an $O(mn^2U)$ time algorithm. Ford and Fulkerson introduced the notion of residual graphs
and augmenting paths, and showed the convergence of the successive augmentation algorithm via
the max-flow min-cut theorem [FF54].

Proving faster convergence of flow augmentations has received much attention since the 1970s 
due to weighted network flow being a special case of linear programs. Works by Edmonds- 
Karp [EK72] and Karzanov [Kar73] gave weakly, as well as strongly polynomial time algorithms 
for finding maximum flows on capacitated graphs. 

Partly due to the connection with linear programming, the strongly polynomial case, as well as 
its generalizations to min-cost flows and lossy generalized flows, subsequently received significantly 
more attention. 

To date, there have been three main approaches for solving max-flow in the strongly-polynomial 
setting: 


1. Augmenting paths [EK72; Kar73; Din70; Din73; GG88; BK04]. 
2. Push-relabel [GT88b; Gol08; GHKKTW15; OG21]. 
3. Pseudo-flows [Hoc08; CH09; FHM10]. 


These flow algorithms in turn motivated the study of dynamic tree data structures [GN80],
which allow for the quick identification of bottleneck edges in dynamically changing trees. Suitably
applying these dynamic trees gives a max-flow algorithm in the strongly-polynomial setting with
runtime $\tilde O(nm)$, which is within polylog factors of the flow decomposition barrier. This barrier
lower bounds the combinatorial complexity of representing the final flow as a collection of paths.

Obtaining faster algorithms hinges strongly upon handling paths using data structures and
measuring progress more numerically [EK72; Gab85; GR98; DS08]. Such views date back to the
Edmonds-Karp [EK72] weakly polynomial algorithm based on finding bottleneck shortest paths
which takes $O(m^2 \log U)$ time. Karzanov [Kar73] and independently Even-Tarjan [ET75] further
showed that in unit capacity graphs, maximum flow can be solved in time $O(m \min(\sqrt{m}, n^{2/3}))$ by
combining a fast bottleneck finding approach with a dual-based convergence argument. A related
algorithm by Hopcroft-Karp [HK73] showed that maximum bipartite matching can be solved in
$O(m\sqrt{n})$ time.

Our algorithm in some sense can be viewed as implementing a data structure that identifies
approximate bottlenecks in $n^{o(1)}$ time per update, except we use a much more complicated
definition of 'bottleneck' motivated by interior point methods. Subquadratic running times using
numerical methods started with the study of scaling algorithms for weighted matchings [Gab85]
and negative length shortest paths/negative cycle detection [Gol95]. In these directions, Goldberg
and Rao [GR98] used binary blocking flows to obtain a runtime of $\tilde O(m \min(\sqrt{m}, n^{2/3}) \log U)$ for
max-flow.

More systematic studies of numerical approaches to network flows took place via the Laplacian
paradigm [ST04]. Daitch and Spielman [DS08] made the critical observation that when interior
point methods are applied to single commodity flow problems, the linear systems that arise
are graph Laplacians, which can be solved in nearly-linear time [ST04]. This immediately implied
$\tilde O(m^{1.5} \log U)$ time algorithms for min-cost flow problems with integral costs/capacities, and
provided the foundations for further improvements. Christiano, Kelner, Madry, Spielman, and
Teng then gave the first exponent beyond 1.5 for max-flow: an algorithm that computes $(1 + \epsilon)$-approximate
max-flows in undirected graphs in time $\tilde O(mn^{1/3}\epsilon^{-11/3})$ [CKMST11]. This motivated
substantial progress on numerically driven flow algorithms, which broadly fall into two categories:


• Obtaining faster approximation algorithms for max-flow and its multi-commodity generalizations
in undirected graphs through first-order methods [KMP12; She13; KLOS14; Pen16;
She17], leading to a runtime of $\tilde O(m\epsilon^{-1})$ for $(1 + \epsilon)$-approximate max-flow.
• Reducing the iteration complexity of high accuracy methods such as interior point methods:
from $m^{1/2}$ to $n^{1/2}$, or $m^{1/3+o(1)}$ for unit capacity max-flow [CKMST11; Mad13; LS19; Mad16;
LS19; KLS20].


Over the past two years, further progress took place via data structures tailored to electrical
flows arising in interior point methods. These led to near-optimal runtimes for max-flow and
min-cost flow on dense graphs [BLNPSSSW20; BLLSSSW21] as well as improvements over $m^{1.5}$ in
sparse capacitated settings [GLP21; BGJLLPS21]. Our approach broadly falls into this category,
except we use dynamic tree-like data structures as the starting point as opposed to electrical flow
data structures, and modify our interior point methods towards them. Notably, we use interior
point methods based on undirected min-ratio cycles instead of electrical flows. Hence, our methods
use $\Omega(m)$ iterations instead of $m^{1/2}$ or $n^{1/2}$ which is common to all algorithms subsequent to [DS08].


A.2 Minimum-Cost Flows 


Work on the minimum cost flow problem can be traced back to the Hungarian algorithm for the
assignment problem [Kuh12]. This problem is a special case of minimum cost flow on bipartite
graphs with unit capacity edges. When generalized to graphs with arbitrary integer capacities, the
algorithm runs in $O((n+F)m)$ time where $F$ is the total units of flow sent. Algorithms with similar
running time guarantees include many variants of network simplex [AGOT92], and the out-of-kilter
algorithm [Ful61].

Strongly polynomial time algorithms for min-cost flows have been extensively studied [Tar85;
GT88a; OPT93; Orl93; Orl96], with the fastest runtime also about $O(nm)$. Many of these algorithms
are also based on augmenting minimum mean cycles, which are closely related to our undirected
minimum-ratio cycles. However, the admissible cycles in these algorithms are directed, and their
analyses have strongly polynomial running time as the goal.

The assignment problem has been a focal point for studying scaling algorithms that obtain high
accuracy solutions numerically [GT87; GT89b; DPS18]. This is partly due to the negative-length
shortest path problem also reducing to it [Gab85; Gol95]. These scaling algorithms obtain runtimes
of the form of $\tilde O(m^{1.5} \log U)$, but also extend to matching problems on non-bipartite graphs [Gab85;
DPS18]. However, to date scaling arguments tend to work on only one of capacities or costs (similar
to the reductions in Appendix C), and all previous runtimes beyond the $O(nm)$ flow decomposition
barrier for computing minimum-cost flows have been via interior point methods [DS08; LS19;
BLLSSSW21; AMV21; BGJLLPS21].


B Omitted Proofs 


B.1 Proof of Lemma 4.12 


Proof. The graph $\tilde G$ will have one more vertex than $G$, denoted by $v^*$. Additionally, we will add a
directed edge between $v^*$ and $v$ for all $v \in V(G)$, where the direction will be decided later. Thus,
$\tilde G$ will have at most $m + n$ edges.

Initially define $f_e^{(\mathrm{init})} = (u_e^- + u_e^+)/2$ for all $e \in E(G)$. However, $f^{(\mathrm{init})}$ will not route the
demand $d$, and we denote the demand it routes by $\hat d \stackrel{\mathrm{def}}{=} B^\top f^{(\mathrm{init})}$. Now, we will describe how to
generate the edge between $v^*$ and $v$. If $\hat d_v = d_v$, then we do not add any edge between $v$ and $v^*$. If
$d_v > \hat d_v$, then add an edge $e_v = (v \to v^*)$ with upper capacity $2(d_v - \hat d_v)$ (and lower capacity 0),
and set $f_{e_v}^{(\mathrm{init})} = d_v - \hat d_v$. If $d_v < \hat d_v$ then add an edge $e_v = (v^* \to v)$ of upper capacity $2(\hat d_v - d_v)$
(and lower capacity 0), and set $f_{e_v}^{(\mathrm{init})} = \hat d_v - d_v$. Finally, set $\tilde d_{v^*} = 0$ and $\tilde d_v = d_v$ for all $v \in V(G)$,
and $\tilde c_{e_v} = 4mU^2$ for the new edges $e_v$, and $\tilde c_e = c_e$ for all $e \in E(G)$.

It is direct to check that all capacities of edges in $\tilde G$ are integral and bounded by $2mU$, and
costs are bounded by $4mU^2$. It suffices then to prove that if $G$ supports a feasible flow, then the
optimal flow in $\tilde G$ must put 0 units on any of the $e_v$ edges. Indeed, note that the $e_v$ edges only can
contribute nonnegative cost as $f_{e_v} \ge 0$ by definition, and if any of them supports one unit of flow
(i.e. $f_{e_v} \ge 1$), then that contributes $4mU^2$ to the cost. The maximum possible cost of edges $e \in E$
is bounded by $mU^2$, as there are $m$ edges of capacity at most $U$, each with cost at most $U$, and
the minimum is at least $-mU^2$. Hence if $G$ supports a feasible flow, $f_{e_v} = 0$ for all the $e_v$ edges.

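We conclude by bounding $\Phi(f^{(\mathrm{init})})$. Note that $f_e^{(\mathrm{init})}$ are all half-integral, and $f_e^{(\mathrm{init})} \le mU$ for all
$e \in E(\tilde G)$. Also, $F^* \ge -mU^2$ by the above discussion, and because all costs are bounded by $4mU^2$,
$\tilde c^\top f^{(\mathrm{init})} \le 2m \cdot 4mU^2 \cdot mU = 8m^3U^3$. Thus
$$\Phi(f^{(\mathrm{init})}) \overset{(i)}{\le} 20m \log(8m^3U^3 + mU^2) + \sum_{e \in E} \left((1/2)^{-\alpha} + (1/2)^{-\alpha}\right) \le 200m \log mU,$$
where (i) used the above bounds and that $u_e^+ - f_e^{(\mathrm{init})}$, $f_e^{(\mathrm{init})} - u_e^-$ are all half-integral.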
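The construction in this proof can be sketched in Python as follows. The sketch assumes the convention that entry $v$ of $B^\top f$ is the flow leaving $v$ minus the flow entering $v$, stores edges as tuples `(tail, head, lower, upper, cost)` with integer capacities, and uses exact rational arithmetic; the function names are illustrative and not from the paper.

```python
from fractions import Fraction

def net_outflow(num_vertices, edges, flow):
    """Entry v of B^T f: flow leaving v minus flow entering v."""
    out = [Fraction(0)] * num_vertices
    for (a, b, _lo, _up, _cost), fe in zip(edges, flow):
        out[a] += fe
        out[b] -= fe
    return out

def add_star_vertex(n, edges, d, U):
    """Return (edges, initial flow, demands) of the graph with the extra vertex v* = n."""
    m = len(edges)
    star = n
    # Start every original edge at the midpoint of its capacity interval.
    flow = [Fraction(lo + up, 2) for (_a, _b, lo, up, _c) in edges]
    routed = net_outflow(n, edges, flow)
    new_edges, new_flow = list(edges), list(flow)
    for v in range(n):
        gap = d[v] - routed[v]
        if gap > 0:    # v must send out more flow: add edge v -> v*
            new_edges.append((v, star, 0, 2 * gap, 4 * m * U * U))
            new_flow.append(gap)
        elif gap < 0:  # v must receive more flow: add edge v* -> v
            new_edges.append((star, v, 0, -2 * gap, 4 * m * U * U))
            new_flow.append(-gap)
    return new_edges, new_flow, list(d) + [0]
```

Each new edge carries exactly half of its capacity, so the initial flow is strictly inside every capacity interval and routes the original demands with zero demand at `v*`.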


B.2 Proof of Theorem 5.11 


The goal of this section is to show Theorem 5.11. To obtain our result, we require the following 
result from [SW19]. 


Theorem B.1 (see [SW19, Theorem 1.2]). Given an unweighted, undirected $m$-edge graph $G$,
there is an algorithm that finds a partition of $V(G)$ into sets $V_1, V_2, \dots, V_k$ such that for each
$1 \le j \le k$, $G[V_j]$ is a $\psi$-expander for $\psi = \Omega(1/\log^3(m))$ and there are at most $m/4$ edges that
are not contained in one of the expander graphs. The algorithm runs in time $O(m \log^7(m))$ and
succeeds with probability at least $1 - n^{-C}$ for any constant $C$ fixed before the start of the algorithm.


We run Algorithm 8 (given below) to obtain the graphs $G_i$ as desired in Theorem 5.11.


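Algorithm 8: DECOMPOSE(G)

1 $\ell \leftarrow \lceil \log_2 \Delta_{\max}(G) \rceil + 1$; $G_\ell = G$
2 for $i = \ell, \ell - 1, \dots, 1$ do
3     Let $G_i'$ denote the graph $G_i$ after adding $2^i$ self-loops to each vertex.
4     Compute an expander decomposition $V_1, V_2, \dots, V_k$ of $G_i'$ as described in Theorem B.1.
5     $G_{i-1} \leftarrow \left(\bigcup_{1 \le j \le k} E_{G_i}(V_j, V \setminus V_j)\right)$.
6     $G_i \leftarrow G_i \setminus G_{i-1}$.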


Claim B.2. For each $i$, the graph $G_i$ has at initialization in Line 1 or Line 5 at most $2^i n$ edges.


Proof. We prove by induction on $i$. For the base case, $i = \ell$, observe that $2^\ell \ge \Delta_{\max}(G)$ and since
$G_\ell$ is a subgraph of $G$, we have $|E(G_\ell)| \le 2^\ell n$.

For $i \mapsto i - 1$, we observe that $G_i$ is unchanged since its initialization until at least after $G_{i-1}$
was defined in Line 5. Thus, using the induction hypothesis and the fact above, we can conclude
that $G_i'$ (defined in Line 3) consists of at most $2^i n$ edges from $G_i$ plus $2^i n$ edges from all self-loops.
But by Theorem B.1, this implies that $|\bigcup_{1 \le j \le k} E_{G_i}(V_j, V \setminus V_j)| = |\bigcup_{1 \le j \le k} E_{G_i'}(V_j, V \setminus V_j)| \le
2^{i+1}n/4 = 2^{i-1}n$, and since this is exactly the edge set of $G_{i-1}$, the claim follows.
Proof of Theorem 5.11. Using Claim B.2 and the insight that each graph $G_i$, after initialization,
can only have edges deleted from it, we conclude that $|E(G_i)| \le 2^i n$ for each $i$.

For the minimum degree property of each $G_i$ with $i > 0$, we observe by Theorem B.1, that for
$G_i'$ and vertex $v$ in expander $V_j$, $\deg_{G_i}(v) = |E_{G_i}(v, V_j \setminus \{v\})| = |E_{G_i'}(v, V_j \setminus \{v\})| \ge \psi 2^i$.


B.3 Proof of Lemma 6.5 


We show Lemma 6.5 using the following steps. First, we assume for the majority of the section
that the weights $v = \vec{1}$, i.e. the all ones vector. We explain later a standard reduction to this
case. Given a low stretch tree $T$ on a graph with lengths $\ell$, and a target set of roots $R$, we explain
how to find a forest $F$ (depending on $R$) that has low total stretch (Definition 6.4). This involves
defining a notion of congestion on edges $e \in E(T)$. Then we explain how to handle dynamic edge
insertions and deletions by adding new roots to the tree, and decrementally maintain the forest.
The trickiest part is to explain how to add roots so that we can return valid stretch overestimates.
At a high level, this is done by computing a heavy-light decomposition of $T$, and using it to inform
our root insertions.

It is useful to maintain the invariant that our set of roots is branch-free at all times, i.e. that the
lowest common ancestor (LCA) of any two roots $r_1, r_2 \in R$ is also in $R$. This is necessary to make
it easier to construct the forest $F$ given $R$. Intuitively, forcing our set of roots to be branch-free is
not a big restriction, as any set of roots can be made branch-free by at most doubling its size.


Definition B.3 (Branch-free sets). For a rooted tree $T$ on vertices $V$, we say that a set $R \subseteq V$ is
branch-free if the LCA of any vertices $r_1, r_2 \in R$ is also in $R$.
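The following Python sketch illustrates Definition B.3 on a rooted tree given as a parent array (with `parent[root] == -1`). It checks whether a set is branch-free and computes the smallest branch-free superset by repeatedly adding missing LCAs. The representation and the function names are our own, and the quadratic-time closure is chosen for clarity rather than efficiency.

```python
def tree_depths(parent):
    """Depth of every vertex in a rooted tree with parent[root] == -1."""
    depth = [-1] * len(parent)
    for v in range(len(parent)):
        stack, u = [], v
        while depth[u] < 0:
            stack.append(u)
            if parent[u] == -1:
                break
            u = parent[u]
        base = depth[u]  # -1 if u is the (not yet labelled) root
        while stack:
            base += 1
            depth[stack.pop()] = base
    return depth

def lca(parent, depth, u, v):
    """Lowest common ancestor, found by lifting the deeper endpoint."""
    while u != v:
        if depth[u] < depth[v]:
            u, v = v, u
        u = parent[u]
    return u

def is_branch_free(parent, R):
    depth, S = tree_depths(parent), set(R)
    return all(lca(parent, depth, a, b) in S for a in S for b in S)

def branch_free_closure(parent, R):
    """Smallest branch-free superset of R."""
    depth, closed = tree_depths(parent), set(R)
    changed = True
    while changed:
        changed = False
        for a in list(closed):
            for b in list(closed):
                c = lca(parent, depth, a, b)
                if c not in closed:
                    closed.add(c)
                    changed = True
    return closed
```

On the complete binary tree with 7 vertices, closing the set `{3, 4, 5}` adds the two LCAs `1` and `0`, which is consistent with the remark that the size at most doubles.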


Given a branch-free set of roots $R$, we build a forest $F$ in the following way. We start with some
total ordering/permutation $\pi$ on the tree edges $E(T)$, and for any two "adjacent" roots $r_1, r_2 \in R$,
we delete from $T$ the smallest edge with respect to $\pi$ on the tree path between them. Here, adjacent
means that no root is on the path between $r_1, r_2$. It is crucial in this construction that $R$ is
branch-free, so that there are exactly $|R| - 1$ adjacent pairs of roots.


Definition B.4 (Forest given roots). Given a rooted tree $T$, a branch-free set of roots $R \subseteq V$, and
a total ordering $\pi$ on $E(T)$, define $F_T(R, \pi)$ as the forest obtained by removing the smallest tree
edge with respect to $\pi$ from every path between adjacent roots in $T$.
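A minimal Python sketch of Definition B.4 follows. It assumes the tree is a parent array with `parent[root] == -1`, identifies each tree edge with its child endpoint, and takes the ordering as a list `rank` where `rank[v]` is the position of the edge `(v, parent[v])`. It relies on the fact that, for a branch-free set, adjacent root pairs are exactly pairs of a root and its nearest proper ancestor that is also a root. Names are illustrative.

```python
from collections import deque

def forest_given_roots(parent, R, rank):
    """Return (removed, root_of): removed tree edges (by child endpoint) and each vertex's root."""
    roots = set(R)
    removed = set()
    for r in roots:
        v, best = r, None
        while parent[v] != -1:
            if best is None or rank[v] < rank[best]:
                best = v          # smallest edge seen so far on the upward path
            v = parent[v]
            if v in roots:        # reached the adjacent root above r
                removed.add(best)
                break
    n = len(parent)
    adj = [[] for _ in range(n)]
    for v in range(n):
        if parent[v] != -1 and v not in removed:
            adj[v].append(parent[v])
            adj[parent[v]].append(v)
    root_of = [None] * n
    for r in roots:               # label each component by its unique root
        root_of[r] = r
        queue = deque([r])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if root_of[w] is None:
                    root_of[w] = r
                    queue.append(w)
    return removed, root_of
```

With a branch-free `R`, the returned forest has exactly `len(R)` components and every vertex receives a root, matching the count of $|R| - 1$ adjacent pairs.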


It is direct to verify that $F_T(R, \pi)$ has exactly $|R|$ connected components, each of which contains
a unique vertex in $R$. We now explain how to construct $\pi$: it sorts the edges by their congestions.


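Definition B.5 (Congestion). Given a graph $G = (V,E)$ with lengths $\ell$, tree $T$ we define the
congestions of edges $e \in E(T)$ as
$$\mathrm{cong}_e^{T,\ell} \stackrel{\mathrm{def}}{=} \sum_{\substack{e' = (u,v) \in E(G) \\ \text{s.t. } e \in T[u,v]}} 1/\ell_{e'}.$$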
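The congestions of Definition B.5 can be computed directly by walking every tree path, as in the Python sketch below. It is a quadratic-time illustration only; the parent-array representation (tree edges indexed by their child endpoint, `parent[root] == -1`) and the function names are assumptions of the sketch.

```python
def tree_depths(parent):
    """Depth of every vertex in a rooted tree with parent[root] == -1."""
    depth = [-1] * len(parent)
    for v in range(len(parent)):
        stack, u = [], v
        while depth[u] < 0:
            stack.append(u)
            if parent[u] == -1:
                break
            u = parent[u]
        base = depth[u]  # -1 if u is the (not yet labelled) root
        while stack:
            base += 1
            depth[stack.pop()] = base
    return depth

def congestions(parent, graph_edges, lengths):
    """cong[v] is the congestion of the tree edge (v, parent[v]); the root entry stays 0."""
    depth = tree_depths(parent)
    cong = [0.0] * len(parent)
    for (u, v), length in zip(graph_edges, lengths):
        # Walk the tree path T[u, v] by repeatedly lifting the deeper endpoint.
        while u != v:
            if depth[u] < depth[v]:
                u, v = v, u
            cong[u] += 1.0 / length
            u = parent[u]
    return cong
```

Sorting the tree edges by the returned values gives an ordering $\pi$ by increasing congestion.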


We show that if $\pi$ is ordered by increasing congestions, then $F = F_T(R, \pi)$ has low total stretch.


Lemma B.6 (Valid $\pi$). For a graph $G = (V, E)$ with lengths $\ell$, a rooted tree $T$, and a branch-free
set of roots $R$, let $\pi$ be a total ordering on $E(T)$ sorted by increasing $\mathrm{cong}^{T,\ell}$ (Definition B.5). Then
for $F = F_T(R, \pi)$, we have $\sum_{e \in E} \mathrm{str}_e^{F,\ell} \le 2 \sum_{e \in E} \mathrm{str}_e^{T,\ell}$.
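Proof. Let $\hat E$ be the set of edges deleted from $E(T)$ to get $F_T(R, \pi)$. For an edge $e \in \hat E$ on a path
between adjacent roots $r_1(e), r_2(e) \in R$, define $L_e = \langle \ell, |p(T[r_1(e), r_2(e)])| \rangle$ as the length of the
path between $r_1(e), r_2(e)$ in $T$. First note by Definition 6.4 that for an edge $e' = (u,v) \in E(G)$
$$\mathrm{str}_{e'}^{F,\ell} \le \mathrm{str}_{e'}^{T,\ell} + \sum_{e \in \hat E \cap T[u,v]} L_e/\ell_{e'}. \quad (55)$$
Thus, we can bound
$$\sum_{e' \in E(G)} \mathrm{str}_{e'}^{F,\ell} \overset{(i)}{\le} \sum_{e' \in E(G)} \mathrm{str}_{e'}^{T,\ell} + \sum_{e' = (u,v) \in E(G)} \sum_{e \in \hat E \cap T[u,v]} L_e/\ell_{e'}$$
$$= \sum_{e' \in E(G)} \mathrm{str}_{e'}^{T,\ell} + \sum_{e \in \hat E} L_e \mathrm{cong}_e^{T,\ell} = \sum_{e' \in E(G)} \mathrm{str}_{e'}^{T,\ell} + \sum_{e \in \hat E} \sum_{f \in T[r_1(e), r_2(e)]} \ell_f \mathrm{cong}_e^{T,\ell}$$
$$\overset{(ii)}{\le} \sum_{e' \in E(G)} \mathrm{str}_{e'}^{T,\ell} + \sum_{e \in \hat E} \sum_{f \in T[r_1(e), r_2(e)]} \ell_f \mathrm{cong}_f^{T,\ell} \overset{(iii)}{\le} \sum_{e' \in E(G)} \mathrm{str}_{e'}^{T,\ell} + \sum_{e \in E(T)} \ell_e \mathrm{cong}_e^{T,\ell}$$
$$= 2 \sum_{e' \in E(G)} \mathrm{str}_{e'}^{T,\ell}.$$
Here (i) follows from (55), (ii) follows from the fact that $\pi$ is sorted by increasing $\mathrm{cong}^{T,\ell}$ so
$\mathrm{cong}_e^{T,\ell} \le \mathrm{cong}_f^{T,\ell}$ for all $f \in T[r_1(e), r_2(e)]$, and (iii) follows as $T[r_1(e), r_2(e)]$ are disjoint paths.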


To handle item 4 of Lemma 6.5, we initialize the set of roots R to have size O(m/k) to already 
satisfy item 4. This set of roots R exists by a standard decomposition result due to [ST04]. 


Lemma B.7 (Tree Decomposition, [ST03; ST04]). There is a deterministic linear-time algorithm
that on a graph $G = (V,E)$ with weights $w \in \mathbb{R}^E_{>0}$, a rooted spanning tree $T$, and a reduction
parameter $k$, outputs a decomposition $W$ of $T$ into edge-disjoint sub-trees such that:


1. $|W| = O(m/k)$.

2. $R \stackrel{\mathrm{def}}{=} \partial W \subseteq V$, defined as the subset of vertices that appear in multiple components, is branch-free.

3. For every component $C \subseteq V$ of $W$, the total weight of edges adjacent to non-boundary vertices
of $C$ is at most $O(\|w\|_1 \cdot k/m)$, i.e. $\sum_{e : e \cap (C \setminus R) \ne \emptyset} w_e \le 40 \cdot \|w\|_1 \cdot k/m$.


Item 4 of Lemma 6.5 then follows from Lemma B.7 by taking $w_e = 1$ for all $e \in E$.

We now explain how to add roots to $R$ under insertions and deletions of edges $e = (u, v)$. While
a naïve approach is to add both $u, v$ to the set $R$, i.e. $R \leftarrow R \cup \{u, v\}$ (and then add more roots
to make it branch-free), this does not work because $\mathrm{str}_e$ might fluctuate significantly, and not be
globally upper bounded as we want. To fix this, we introduce a more complex procedure that adds
$\tilde O(1)$ additional roots to $R$ to control the number of potential roots that an edge $e$ is assigned to.

Formally, we will construct an auxiliary tree with the same root and vertex set as $T$. This tree
is constructed via a heavy-light decomposition on $T$. Then we replace each heavy chain with a
balanced binary tree. Thus, this auxiliary tree has height $O(\log^2 n)$. When a vertex $u$ is added to
$R$, we walk up the tree induced by the heavy-light decomposition and add all the ancestors of $u$.
Thus, at most $O(\log^2 n)$ additional vertices will be added to $R$, but each edge will only be assigned
to $O(\log^2 n)$ distinct roots.

We introduce one additional piece of notation. Given a rooted tree $T_H$ (the tree defined by the
heavy-light decomposition on $T$) and vertex $u$, define $u^{\uparrow T_H}$ as the set of its ancestors in $T_H$ plus
itself. We extend the notation to any subset of vertices by defining $R^{\uparrow T_H} = \bigcup_{u \in R} u^{\uparrow T_H}$.
Lemma B.8 (Heavy-Light Decomposition of Trees, [ST83]). There is a linear-time algorithm that
given a rooted tree $T$ with $n$ vertices outputs a collection of vertex disjoint tree paths $\{P_1, \dots, P_t\}$
(called heavy chains), such that the following hold for every vertex $u$:


1. There is exactly one heavy chain $P_i$ containing $u$.
2. If $P_i$ is the heavy chain containing $u$, at most one child of $u$ is in $P_i$.
3. There are at most $O(\log n)$ heavy chains that intersect with $u^{\uparrow T}$.
In addition, edges that are not covered by any heavy chain are called light edges.
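A standard way to obtain such a decomposition is to let every vertex extend its chain into the child with the largest subtree. The Python sketch below implements this rule on a parent array (`parent[root] == -1`) and provides a helper that counts the chains met on a root path; it illustrates the statement of Lemma B.8 under these representation choices and uses illustrative names.

```python
def heavy_light_chains(parent):
    """Return (chains, chain_of): chains listed top-down and the chain index of each vertex."""
    n = len(parent)
    children = [[] for _ in range(n)]
    root = None
    for v, p in enumerate(parent):
        if p == -1:
            root = v
        else:
            children[p].append(v)
    order = [root]
    for u in order:                 # breadth-first order: parents before children
        order.extend(children[u])
    size = [1] * n
    for u in reversed(order):       # accumulate subtree sizes bottom-up
        if parent[u] != -1:
            size[parent[u]] += size[u]
    chain_of = [None] * n
    chains = []
    for u in order:
        if chain_of[u] is None:     # u was not chosen as a heavy child: new chain
            chain_of[u] = len(chains)
            chains.append([u])
        if children[u]:
            heavy = max(children[u], key=lambda c: size[c])
            chain_of[heavy] = chain_of[u]
            chains[chain_of[u]].append(heavy)
    return chains, chain_of

def chains_on_root_path(parent, chain_of, u):
    """Set of heavy chains intersecting the path from u to the root."""
    seen = set()
    while u != -1:
        seen.add(chain_of[u])
        u = parent[u]
    return seen
```

Since following a light edge upward at least doubles the subtree size, the number of chains on any root path is at most $\lfloor \log_2 n \rfloor + 1$.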


Lemma B.9. There is a linear-time algorithm that given a tree $T$ rooted at $r$ with $n$ vertices
outputs a rooted tree $T_H$ supported on the same vertex set such that


1. The height of $T_H$ is $O(\log^2 n)$.
2. For any subset of vertices $R$ in $T$, $R^{\uparrow T_H}$ is branch-free in $T$.


3. Given any total ordering on tree edges $\pi$ and nonempty vertex subset $R \subseteq V(T)$, $\mathrm{root}_u^F \in u^{\uparrow T_H}$
for every vertex $u$ where $F = F_T(R^{\uparrow T_H}, \pi)$.


4. Given any total ordering on tree edges $\pi$ and nonempty vertex subsets $R_1, R_2 \subseteq V(T)$,
$\mathrm{root}_u^{F_1} = \mathrm{root}_u^{F_2}$ if $R_1^{\uparrow T_H} \cap u^{\uparrow T_H} = R_2^{\uparrow T_H} \cap u^{\uparrow T_H}$ where $F_i = F_T(R_i^{\uparrow T_H}, \pi)$, $i = 1, 2$. That
is, the root of $u$ in any rooted spanning forest of the form $F_T(R^{\uparrow T_H}, \pi)$ is determined by the
intersection of $u$'s ancestors and forest roots.

Proof. We first present the construction of the rooted tree Ty. We root Ty at the root r of T 
and compute its heavy-light decomposition in linear-time via Lemma B.8. Let {P,,...,P:} be the 
resulting decomposition. For every path P;, we build a balanced binary search tree (BST) T; over 
its vertices, V(P;), with respect to their depth in T. The depth of a vertex in T is defined as the 
distance to the root r. In addition, we make the vertex with minimum depth the root of the BST 
T;. Ty is then obtained from T by replacing every path P; by BST T;. 

To show condition 1, observe that the path Ty[u,r] consists of O(log n) node-to-root paths in 
some balanced BSTs and O(log n) light edges. Each node-to-root path in some balanced BST has 
length at most O(log n). Therefore, the Ty[u, r]-path has length at most O(log? n). 

Next, we prove condition 2. For any two vertices u and v in R'™™*, let w be their lowest common 
ancestor in T. Let Py be the heavy chain containing w. Thus, for at least one of T[u,r] and T|v,r], 
w must be the first vertex of P, that appears on that path or else w has two distinct children that 
belong to P,,. Then, w appears in either Ty[u,r] or Ty[v,r] as well and thus is included in R'*. 

We prove Condition 3 by induction on the depth of u in Ty, i.e. the size of u'™. If |u| = 1, 
u is the root of Ty and R'* contains u for any nonempty R. Thus, root! is u itself. Next, we 
consider the case where |u'?#| = k + 1. Let v be the first vertex in R'* on the path Ty[u, r]. Let 
P be the heavy chain containing v and b be the first vertex in P on the path Ty|u,r]. 

If P does not contain u, let a be the vertex before b on the path Tylu, r]. The sequence u, a,b, v 
shows up in the same order as in the path Ty[u, r]. Since R'# does not contain a, R'* does not 
contain any vertex in the subtree of Ty rooted at a as well as the subtree of T rooted at a. Thus, u, a, 
and b are connected in the forest F(R‘, 7) and share the same root. The size of bt" is less than 
the size of u'™ and we can apply induction hypothesis to argue that root = root} E bH Ç un, 

If $P$ contains $u$, we will show that $u$ is connected to some other vertex $w \in P \cap u^{\uparrow T_H}$ in 
the rooted forest $F = F_T(R^{\uparrow T_H}, \pi)$. Let $C$ be the set of vertices in $P$ connected to $u$. Observe that $C$ 
forms a contiguous subpath of $P$ and contains one root from $R^{\uparrow T_H}$. Recall that the subtree in $T_H$ 
corresponding to $P$ is a balanced binary search tree keyed by depth in $T$. Let $B$ be this binary 
search tree. It is known that given a binary search tree and a range of keys, the set of nodes in the 
tree within the range is closed under taking lowest common ancestors. Let $w$ be the lowest common 
ancestor of all vertices in $C$ in the BST $B$. Then $w$ must be an element of $R^{\uparrow T_H}$ and therefore $\mathrm{root}^F_u = w$. 
This concludes the proof of Condition 3. 

To prove Condition 4, it suffices to argue the case where $R_2 = R_1 \cup \{r\}$ and $R_1^{\uparrow T_H}$ contains 
every ancestor of $r$. Specifically, writing $F_j = F_T(R_j^{\uparrow T_H}, \pi)$, we prove that $\mathrm{root}^{F_2}_u = \mathrm{root}^{F_1}_u$ for every vertex $u$ which does not 
have $r$ as its ancestor. Let $C$ be the component of $F_1$ in which $r$ lives and $w$ be the root of $C$. 
Adding $r$ as a new root removes some edge between $r$ and $w$ and divides $C$ into two components 
$C_1$ and $C_2$. Suppose that $r \in C_1$ and $w \in C_2$. Condition 3 says that $r$ is an ancestor w.r.t. $T_H$ of every 
vertex in $C_1$. However, only vertices in $C_1$ have their root changed. This concludes the proof of 
Condition 4. 


We now provide an algorithm for Lemma 6.5 and prove that it works. At a high level, the 
algorithm first computes an LSST. Then it computes global stretch overestimates based on 
the tree $T_H$ from Lemma B.9. Finally, it initializes a set of roots of size $O(m/k)$ by adding all 
endpoints of large-stretch edges as terminals, and by calling Lemma B.7 to bound the degree of 
each component of $F$ as needed in item 4 of Lemma 6.5. 


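Proof of Lemma 6.5. For the weights $v$, we first construct a graph $G_v$ that has $\lceil m v_e / \|v\|_1 \rceil$ 
unweighted copies of the edge $e$ for each $e \in E(G)$. Note that $G_v$ has at most $2m$ edges: 

$$\sum_{e \in E(G)} \left\lceil \frac{m v_e}{\|v\|_1} \right\rceil \le \sum_{e \in E(G)} \left(1 + \frac{m v_e}{\|v\|_1}\right) = 2m.$$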
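
As a quick sanity check of the $2m$ bound on the number of copies, the following minimal Python sketch computes $\lceil m v_e / \|v\|_1 \rceil$ for a small weight vector. The function name `copy_counts` and the example weights are illustrative assumptions and do not appear in the paper.

```python
from fractions import Fraction
from math import ceil


def copy_counts(v):
    """Return ceil(m * v_e / ||v||_1) for every edge weight v_e (assumed positive)."""
    m = len(v)
    total = sum(Fraction(w) for w in v)  # ||v||_1
    return [ceil(Fraction(m) * Fraction(w) / total) for w in v]


weights = [1, 1, 5, 13]          # hypothetical edge weights v_e
counts = copy_counts(weights)    # number of unweighted copies per edge
```

Exact rational arithmetic is used so that the ceiling is not affected by floating-point rounding.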


Let $T$ be an LSST on $G_v$ computed using Theorem 3.2, with an arbitrarily chosen root $r$, and let $T_H$ 
be the tree in Lemma B.9. Let $\pi$ be the permutation sorted by increasing congestions (Lemma B.6). 
Notice that $G \subseteq G_v$, and thus every spanning tree/forest of $G_v$ is also a spanning tree/forest of 
$G$. Furthermore, either the tree stretch or forest stretch of an edge $e \in E(G)$ is equal to that of any 
copy of $e$ in $G_v$. 
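We now explain how to compute stretch overestimates $\widetilde{\mathrm{str}}_e$. For $i \ge 0$, let $B_i$ be the set of 
vertices within distance $i$ of the root $r$ in $T_H$. Let $D = O(\log^2 n)$ be the height of $T_H$. We define 

$$\widetilde{\mathrm{str}}_e := 2 \sum_{i=0}^{D} \mathrm{str}^{F_T(B_i, \pi)}_e. \qquad (56)$$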


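In this definition, $\widetilde{\mathrm{str}}_e$ takes identical values among copies of $e$ in $G_v$. By Lemma B.6 we know that 

$$\sum_{e \in E(G_v)} \widetilde{\mathrm{str}}_e = 2 \sum_{i=0}^{D} \sum_{e \in E(G_v)} \mathrm{str}^{F_T(B_i, \pi)}_e \le O(D \cdot m \gamma_{LSST}) = O(m \gamma_{LSST} \log^2 n). \qquad (57)$$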


Thus the total stretch is within the required bound. Shortly, we will explain the full algorithm for maintaining the 
set of roots and why the $\widetilde{\mathrm{str}}_e$ are valid stretch overestimates for our algorithm. 

To explain how we maintain the set of roots, we first explain how to initialize it. 
First run Lemma B.7 on $T$ with uniform weights $w_e = 1$ for all $e \in E(G)$ to output a set $W^T$ with $|W^T| = 
O(m/(k \log^2 n))$, such that each component has total adjacent weight $k$ (minus the boundaries), 
i.e. at most $k$ adjacent edges (as $w_e = 1$ for all $e$). Start by defining $R_0 \leftarrow \partial W^T$. So far, 
$|R_0| = O(m/(k \log^2 n))$. 
Also for any edge $e \in E(G)$ with $\widetilde{\mathrm{str}}_e \ge O(k \gamma_{LSST} \log^4 n)$, add both endpoints of $e$ to $R_0$. As 

$$\sum_{e \in E(G)} \widetilde{\mathrm{str}}_e \le \sum_{e \in E(G)} \left\lceil \frac{m v_e}{\|v\|_1} \right\rceil \widetilde{\mathrm{str}}_e = \sum_{e \in E(G_v)} \widetilde{\mathrm{str}}_e \le O(m \gamma_{LSST} \log^2 n),$$

Markov's inequality tells us that the number of edges $e \in E(G)$ with $\widetilde{\mathrm{str}}_e \ge O(k \gamma_{LSST} \log^4 n)$ is 
bounded by $O(m/(k \log^2 n))$. Thus, overall $|R_0| = O(m/(k \log^2 n))$. Now, define our initial set 
of branch-free roots as $R = R_0^{\uparrow T_H}$. Because the height of $T_H$ is $O(\log^2 n)$ (Lemma B.9), we know 
$|R| = O(m/k)$. 

We now handle edge insertions and deletions. When an edge $e = (u,v)$ is inserted or deleted, 
we add $u^{\uparrow T_H} \cup v^{\uparrow T_H}$ to $R$, i.e. $R \leftarrow R \cup (u^{\uparrow T_H} \cup v^{\uparrow T_H})$. This is branch free by Lemma B.9. If $e$ was 
inserted, assign it $\widetilde{\mathrm{str}}_e = 1$, as both endpoints are roots in $R$. We also update the forest 
$F \leftarrow F_T(R, \pi)$. Because $R$ is incremental and $\pi$ is a total ordering, $F$ is decremental. 

We now verify all items of Lemma 6.5. Item 1 follows because initially $|R| = O(m/k)$, and the 
height of $T_H$ is $O(\log^2 n)$, so each edge insertion/deletion increases the size of $R$ by $O(\log^2 n)$. Item 
3 follows because 

$$\sum_{e \in E(G)} v_e \widetilde{\mathrm{str}}_e \le \frac{\|v\|_1}{m} \sum_{e \in E(G)} \left\lceil \frac{m v_e}{\|v\|_1} \right\rceil \widetilde{\mathrm{str}}_e = O(\|v\|_1 \gamma_{LSST} \log^2 n),$$

where the last step follows from (57). 

Let $F_0 = F_T(R, \pi)$ be the initial rooted spanning forest to be output. Let $W$ be a refinement 
of $W^T$ induced by the connectivity in $F_0$. We output $W$ as the desired edge-disjoint partition of 
$F_0$ into $O(m/k)$ subtrees. $W$ contains at most $O(m/k)$ subtrees because $F_0$ is obtained from $T$ by 
removing $|R| - 1$ edges and $W^T$ is an edge-disjoint partition of $T$. Item 4 follows because $R \supseteq \partial W^T$, 
where $W^T$ was a partition of $G$ into pieces of total degree $O(k)$. 

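We conclude by checking item 2, i.e. that $\widetilde{\mathrm{str}}_e$ upper-bounds $\mathrm{str}^F_e$ at any moment for every 
edge $e = (u,v)$. If $u, v$ are in the same connected component of $F$, then $\mathrm{str}^F_e = \mathrm{str}^T_e \le \widetilde{\mathrm{str}}_e$ 
by noting that $T = F_T(B_0, \pi)$. In the other case where $u$ and $v$ are disconnected, let $R^{\uparrow T_H}$ 
be the current set of roots of $F$. There must be some non-negative integer $i$ (and $j$) such that 
$u^{\uparrow T_H} \cap R^{\uparrow T_H} = u^{\uparrow T_H} \cap B_i$ (and $v^{\uparrow T_H} \cap R^{\uparrow T_H} = v^{\uparrow T_H} \cap B_j$ respectively). To finish, note that item 4 
of Lemma B.9 ensures that 

$$\mathrm{root}^F_u = \mathrm{root}^{F_i}_u, \quad \mathrm{root}^F_v = \mathrm{root}^{F_j}_v, \quad \text{and therefore} \quad \mathrm{str}^F_e \le \mathrm{str}^{F_i}_e + \mathrm{str}^{F_j}_e \le \widetilde{\mathrm{str}}_e,$$

where $F_i = F_T(B_i, \pi)$. 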


Finally, it can be checked that the total runtime is O(m), as every operation can be implemented 
efficiently. 


B.4 Proof of Lemma 6.6 
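Proof. Let $W = O(\gamma_{LSST} \log^2 n)$ be such that items 2, 3 of Lemma 6.5 imply 

$$\sum_{e \in E} v_e \widetilde{\mathrm{str}}_e \le W \|v\|_1, \quad \text{and} \quad \max_{e \in E} \widetilde{\mathrm{str}}_e \le kW \log^2 n.$$

Let $t = 10 k W \log^2 n = \widetilde{O}(k)$. The algorithm sequentially constructs edge weights $v_1, \ldots, v_t$ in a 
multiplicative weight update fashion and trees $T_1, \ldots, T_t$, forests $F_1, \ldots, F_t$, and stretch overestimates 
$\widetilde{\mathrm{str}}^1, \ldots, \widetilde{\mathrm{str}}^t$ via Lemma 6.5. 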
Initially, $v_1 = \mathbf{1}$ is the all-ones vector. After computing $T_i$, $v_{i+1}$ is defined as 

$$v_{i+1,e} := v_{i,e} \exp\left(\frac{\widetilde{\mathrm{str}}^i_e}{t}\right) = \exp\left(\frac{1}{t} \sum_{j=1}^{i} \widetilde{\mathrm{str}}^j_e\right) \quad \text{for all } e \in E.$$

Finally we define the distribution $\lambda$ to be uniform over the set $\{1, \ldots, t\}$. 
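To show the desired bound (25), we first relate it to $\|v_{t+1}\|_1$ using the following: 

$$\max_{e \in E} \frac{1}{t} \sum_{i=1}^{t} \widetilde{\mathrm{str}}^i_e \le \log\left(\sum_{e \in E} \exp\left(\frac{1}{t} \sum_{i=1}^{t} \widetilde{\mathrm{str}}^i_e\right)\right) = \log \|v_{t+1}\|_1,$$

where $v_{t+1}$ is defined similarly even though it is never used in the algorithm. 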
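Next, we upper bound $\|v_i\|_1$ inductively for every $i = 1, \ldots, t+1$. Initially, $v_1 = \mathbf{1}$ and we 
have $\|v_1\|_1 = m$. To bound $\|v_{i+1}\|_1$, we plug in the definition and have the following: 

$$\|v_{i+1}\|_1 = \sum_e v_{i,e} \exp\left(\frac{\widetilde{\mathrm{str}}^i_e}{t}\right) \le \sum_e v_{i,e}\left(1 + \frac{2\,\widetilde{\mathrm{str}}^i_e}{t}\right) = \|v_i\|_1 + \frac{2}{t}\sum_e v_{i,e}\,\widetilde{\mathrm{str}}^i_e \le \|v_i\|_1 + \frac{2W}{t}\|v_i\|_1 = \left(1 + \frac{2W}{t}\right)\|v_i\|_1,$$

where the first inequality comes from the bound $\widetilde{\mathrm{str}}^i_e \le kW\log^2 n = 0.1t$ and $e^x \le 1 + 2x$ for $0 \le x \le 0.1$. 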
Applying the inequality iteratively yields 

$$\exp\left(\frac{1}{t}\sum_{i=1}^{t}\widetilde{\mathrm{str}}^i_e\right) = v_{t+1,e} \le \|v_{t+1}\|_1 \le \left(1 + \frac{2W}{t}\right)^t \|v_1\|_1 \le \exp(2W)\, m.$$

The desired bound (25) now follows by taking the logarithm of both sides. 
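
The multiplicative weight update argument above can be checked numerically. The sketch below is only an illustration under stated assumptions: Lemma 6.5 is replaced by a hypothetical oracle that places all stretch on the currently lightest edge (so that $\sum_e v_e \widetilde{\mathrm{str}}_e \le W\|v\|_1$ and $\widetilde{\mathrm{str}}_e \le 0.1t$ hold by construction), and the values of `m`, `W`, and `t` are arbitrary.

```python
import math


def mwu_average_stretch(m, W, t):
    """Run t rounds of the weight update and return max_e (1/t) * sum_i str_e^i."""
    cap = 0.1 * t                 # per-round stretch cap, i.e. str_e <= 0.1 * t
    v = [1.0] * m                 # v_1 is the all-ones vector
    total = [0.0] * m             # running sum of stretches per edge
    for _ in range(t):
        norm = sum(v)             # ||v_i||_1
        # Hypothetical oracle: all stretch goes to the lightest edge, chosen so
        # that sum_e v_e * s_e <= W * ||v||_1 and s_e <= cap.
        e = min(range(m), key=lambda j: v[j])
        s = [0.0] * m
        s[e] = min(cap, W * norm / v[e])
        v = [vj * math.exp(sj / t) for vj, sj in zip(v, s)]
        total = [a + sj for a, sj in zip(total, s)]
    return max(total) / t


avg = mwu_average_stretch(m=5, W=1.0, t=50)
bound = 2 * 1.0 + math.log(5)     # 2W + log m, the logarithm of exp(2W) * m
```

The returned average stays below $2W + \log m$, as the chain of inequalities predicts for any oracle satisfying the two stretch bounds.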


C Cost and Capacity Scaling for Min-Cost Flows 


In this section, we describe a cost and capacity scaling scheme [Gab85; GT89a; AGOT92] that 
reduces the min-cost flow problem to O(log mU log C) instances with polynomially bounded cost 
and capacity. We prove the following lemma: 


Lemma C.1. Suppose there is an algorithm $\mathcal{A}$ that solves (8) on any $m$-edge graph and poly($m$)-bounded 
integral demands, costs, and lower/upper capacities in $T_{\mathcal{A}}(m)$ time. There is an algorithm 
that on a graph $G = (V, E)$ and a min-cost flow instance $\mathcal{I} = (G, d, c, u^-, u^+)$ with integral demands 
$d$, integral lower/upper capacities $u^-, u^+ \in \{-U, \ldots, U\}^E$, and integral costs $c \in \{-C, \ldots, C\}^E$, 
solves $\mathcal{I}$ exactly in $O(T_{\mathcal{A}}(m) \log m \log mU \log C)$-time. 

Instead of (8), we consider the equivalent min-cost circulation problem: 

$$\min_{\substack{B^\top f = 0 \\ 0 \le f_e \le u_e \text{ for all } e \in E}} c^\top f, \qquad (58)$$


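where cost $c \in \{-C, \ldots, C\}^E$ and capacity $u \in \{1, \ldots, U\}^E$. It satisfies strong duality with dual 
problem: 

$$\max_{\substack{By + s^- - s^+ = c \\ s^-, s^+ \ge 0}} -s^{+\top} u. \qquad (59)$$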


Given a (directed) graph $G = (V, E)$ with costs $c$, capacities $u$, and some feasible circulation $f$ 
to (58), we can define its residual graph $G(f) = (V, E(f))$ with costs $c(f)$ and capacities $u(f)$ as 
follows. For any arc (directed edge) $e = (u,v) \in E$, we include $e$ with cost $c_e$ and capacity $u_e - f_e$ 
if it is not saturated, i.e. $f_e < u_e$. We also include its reverse arc $\mathrm{rev}(e) = (v,u)$ with cost $-c_e$ 
and capacity $f_e$ if $f_e > 0$. Given any directed graph $G$, we use $B(G) \in \{-1,0,1\}^{E \times V}$ to denote its 
edge-vertex incidence matrix that respects the edge orientation. 
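
The residual graph construction can be sketched in a few lines of Python. The representation of arcs as `(tail, head, cost, capacity)` tuples and the name `residual_arcs` are assumptions made for this illustration only.

```python
def residual_arcs(arcs, flow):
    """Build the residual arcs of a feasible circulation.

    arcs: list of (tail, head, cost, capacity); flow: list of f_e, same order.
    Returns a list of (tail, head, residual cost, residual capacity).
    """
    residual = []
    for (u, v, cost, cap), f in zip(arcs, flow):
        if f < cap:                      # e is not saturated: keep e
            residual.append((u, v, cost, cap - f))
        if f > 0:                        # flow can be pushed back along rev(e)
            residual.append((v, u, -cost, f))
    return residual


# A triangle 0 -> 1 -> 2 -> 0 carrying one unit of flow on every arc.
arcs = [(0, 1, -3, 2), (1, 2, 1, 1), (2, 0, 1, 3)]
flow = [1, 1, 1]
example = residual_arcs(arcs, flow)
```

In this example the saturated arc $(1,2)$ contributes only its reverse arc, while the other two arcs contribute both directions.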

Given some positive integers $m, C, U$, we define $T_{MCC}(m, C, U)$ to be the time for exactly solving 
(58) on a graph of at most $m$ arcs with costs $c \in \{-C,\ldots,C\}^E$ and capacities $u \in \{1,\ldots,U\}^E$ w.h.p. 
Theorem 1.1 directly implies the following: 


Corollary C.2. $T_{MCC}(m, \mathrm{poly}(m), \mathrm{poly}(m)) = m^{1+o(1)}$. 


C.1 Reduction to Polynomially Bounded Cost Instances 


In this section, we present a cost scaling scheme (Algorithm 9) for reducing to O(log C) instances 
with polynomially bounded cost. 


Lemma C.3. Suppose there is an algorithm $\mathcal{A}$ that gives an integral exact minimizer to (58) 
on any $m$-edge graph and $m^{10}$-bounded integral costs, $U$-bounded integral capacities in $T_{\mathcal{A}}(m,U)$ 
time. Algorithm 9 takes as input a graph $G = (V, E)$ and an instance of (58) $\mathcal{I} = (G,c,u)$ with costs 
$c \in \{-C,\ldots,C\}^E$ and capacities $u \in \{1,\ldots,U\}^E$, and solves $\mathcal{I}$ exactly in $O(T_{\mathcal{A}}(m, U) \log C + m \log C)$-time. 
In other words, $T_{MCC}(m, C, U) = O((T_{MCC}(m, m^{10}, U) + m) \log C)$.° 


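Algorithm 9: Cost Scaling Scheme for Solving (58) 

procedure CostScaling($G = (V, E)$, $c \in \{-C,\ldots,C\}^E$, $u \in \{1,\ldots,U\}^E$) 
    $f^{(0)} \leftarrow 0$, $y^{(0)} \leftarrow 0$ 
    $T \leftarrow O(\log C)$ 
    for $t = 0, \ldots, T - 1$ do 
        Let $\widetilde{G}(f^{(t)}), \widetilde{c}(f^{(t)}), u(f^{(t)})$ be the cost rounded residual graph (Definition C.6) of $f^{(t)}, y^{(t)}$. 
        Solve (58) on $\widetilde{G}(f^{(t)}), \widetilde{c}(f^{(t)}), u(f^{(t)})$. 
        Let $\Delta_f$ be the primal optimal. 
        Extract dual optimal $\Delta_y$ via Lemma C.9. 
        $f^{(t+1)} \leftarrow f^{(t)} + \Delta_f$ 
        $y^{(t+1)} \leftarrow y^{(t)} + \Delta_y$ 
    Output $f^{(T)}$ 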


In (58), the problem is equivalent under any perturbation of the cost by $By$ for any real 
vector $y \in \mathbb{R}^V$. To see this, given any circulation $f$ (not necessarily feasible), the cost $c^\top f$ is equal to 
$(c - By)^\top f$ for any $y$ because $B^\top f = 0$. Given such $y$, we define the reduced cost of $c$ w.r.t. $y$ as 
$c - By$. 

Here we introduce the idea of $\varepsilon$-optimality, which will be used to characterize exact minimizers 
to integral instances of (58). 


Definition C.4. Given a parameter $\varepsilon > 0$ and a feasible circulation $f$ to (58), we say $f$ is 
$\varepsilon$-optimal if there is some vertex potential $y \in \mathbb{R}^V$ such that $\min_e (c(f) - B(f)y)_e \ge -\varepsilon$, where $G(f)$ 
is the residual graph, $c(f)$ is the residual cost w.r.t. $f$, and $B(f)$ is the edge-vertex incidence 
matrix of $G(f)$. 


°In the proof, we do not make any effort to reduce the exponent of the polynomial bound on costs. 


From Definition C.4, the zero circulation is $C$-optimal, as the initial cost of any edge is at least $-C$. In 
the integral case, a flow $f$ is an exact minimizer if it is $1/(n+1)$-optimal: 


Lemma C.5. In the case where the costs $c$ in (58) are from $\{-C,\ldots,C\}^E$, a feasible integral 
circulation $f$ is an exact minimizer if it is $1/(n+1)$-optimal. 


Proof. Let $\hat{c} = c(f) - B(f)y$ be the witness of $1/(n+1)$-optimality of $f$. The cost of every 
circulation in $G(f)$ is identical between $c(f)$ and $\hat{c}$. In the residual network of $f$, every simple cycle 
has at most $n$ arcs and hence cost at least $-n/(n+1)$ due to $\min_e \hat{c}_e \ge -1/(n+1)$. However, since $f$ is integral, its 
residual graph has integral costs and capacities, so every negative cycle in $G(f)$ must have cost at most $-1$. 
The fact that every simple cycle in $G(f)$ has cost at least $-n/(n+1) > -1$ rules out the existence of 
negative cycles in $G(f)$. Thus, $0$ is the exact minimizer to the residual problem w.r.t. $f$, and $f$ is 
an optimal solution to (58). 
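
To make the $\varepsilon$-optimality criterion concrete, the sketch below computes the smallest $\varepsilon$ witnessed by a given potential on a small residual graph. The residual arcs are hard-coded as `(tail, head, cost)` triples, and the function name `optimality_gap` is a hypothetical helper introduced only for this illustration.

```python
def optimality_gap(residual, y):
    """Smallest eps >= 0 such that every residual arc has reduced cost >= -eps.

    The reduced cost of an arc (a, b) with cost c is c - (y[b] - y[a]).
    """
    worst = min(c - (y[b] - y[a]) for a, b, c in residual)
    return max(0, -worst)


# Residual arcs of a triangle instance after pushing one unit around the cycle.
residual = [(0, 1, -3), (1, 0, 3), (2, 1, -1), (2, 0, 1), (0, 2, -1)]
n = 3

gap_zero_potential = optimality_gap(residual, [0, 0, 0])
gap_good_potential = optimality_gap(residual, [0, -3, -1])
is_exact = gap_good_potential <= 1 / (n + 1)   # criterion of Lemma C.5
```

With the zero potential the circulation is only certified as $3$-optimal, while the potential $(0,-3,-1)$ certifies $0$-optimality and hence exact optimality for this integral instance.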


Intuitively, Algorithm 9 works by computing an augmenting flow $\Delta$ given an $\varepsilon$-optimal solution 
$f$ such that $f + \Delta$ is $(\varepsilon/2)$-optimal. Thus, the algorithm runs for $O(\log C + \log n)$ iterations until 
it reaches a $1/(n+1)$-optimal solution. $\Delta$ is computed via solving (58) with costs rounded to 
polynomial size. That is, given an integral feasible circulation $f$, we define a rounded residual 
graph $\widetilde{G}(f)$ as follows: 


Definition C.6. Given an $\varepsilon$-optimal integral circulation $f$ w.r.t. a vertex potential $y \in \mathbb{R}^V$, we 
define its cost rounded residual graph $\widetilde{G}(f) = (V, E(f))$ with costs $\widetilde{c}(f)$ and capacities $u(f)$ as 
follows: Let $\bar{c}(f)$ be the reduced cost of $c(f)$ w.r.t. $y$, i.e. $\bar{c}(f) = c(f) - B(f)y$. For any arc 
$e = (u,v) \in G(f)$, include $e$ in $\widetilde{G}(f)$ with the same residual capacity $u(f)_e$ and cost $\widetilde{c}(f)_e$ obtained 
by rounding $\bar{c}(f)_e$ to the nearest integral multiple of $\varepsilon/m^9$. 


Remark C.7. When solving (58) on $\widetilde{G}(f)$, we can always ignore edges whose cost is more than 
$\varepsilon m$. Any simple cycle containing such an edge has non-negative cost because costs are at least $-\varepsilon$. 
There will be an optimal solution that does not use any such edge. 

Thus, every edge cost we care about is an integral multiple of $\varepsilon/m^9$ within the range $[-\varepsilon, \varepsilon m]$. By 
dividing the costs by $\varepsilon/m^9$, all the costs are integers within $\{-m^{10},\ldots,m^{10}\}$. 
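
The rounding and rescaling step can be illustrated as follows. This is a minimal sketch under stated assumptions: the granularity is taken to be $\varepsilon/m^k$ with `k = 9` as a default parameter (chosen so that the scaled costs fit within $\pm m^{10}$), and the function name `round_costs` and the sample values are hypothetical.

```python
from fractions import Fraction


def round_costs(reduced_costs, eps, m, k=9):
    """Round reduced costs to multiples of eps / m**k and rescale to integers.

    Arcs with cost above eps * m are marked None, since they can be ignored.
    """
    unit = Fraction(eps) / m**k
    scaled = []
    for c in reduced_costs:
        if c > eps * m:
            scaled.append(None)                       # arc can be dropped
        else:
            scaled.append(round(Fraction(c) / unit))  # nearest multiple, as an integer
    return scaled


# eps = 4 and m = 2, so the unit is 4 / 2**9 = 1/128 and m**10 = 1024.
scaled = round_costs([-4, Fraction(1, 3), 8, 9], eps=4, m=2)
```

All retained costs in the example lie in $\{-m^{10},\ldots,m^{10}\} = \{-1024,\ldots,1024\}$.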


Given a current $\varepsilon$-optimal integral circulation $f$ w.r.t. $y$, Algorithm 9 finds an augmenting 
flow $\Delta_f$ via solving (58) on the instance $\mathcal{I} = (\widetilde{G}(f), \widetilde{c}(f), u(f))$ with polynomially bounded costs 
(Remark C.7). Let $\Delta_y$ be the corresponding dual (59) optimal to the instance $\mathcal{I}$. We show that 
$f + \Delta_f$ is $\varepsilon/2$-optimal w.r.t. $y + \Delta_y$. This is formulated as the following lemma: 


Lemma C.8. Given an $\varepsilon$-optimal integral circulation $f$ w.r.t. $y$, let $\Delta_f$ and $\Delta_y$ be the optimal 
primal and dual solutions to (58) on the instance $\mathcal{I} = (\widetilde{G}(f), \widetilde{c}(f), u(f))$. Then $f + \Delta_f$ is $\varepsilon/2$-optimal w.r.t. 
$y + \Delta_y$. 


Proof. Clearly, $f + \Delta_f$ is an integral feasible circulation. Let $\Delta_y, s^-, s^+$ be the corresponding dual 
solution to (59). We have that $B(f)\Delta_y + s^- - s^+ = \widetilde{c}(f)$. By complementary slackness, we know 
that for any arc $e = (u,v)$ 


1. If $\Delta_{f,e} < u(f)_e$, $s^+_e = 0$, and 


2. If $\Delta_{f,e} > 0$, $s^-_e = 0$. 

For any arc $e = (u, v)$, it is included in the residual graph $G(f + \Delta_f)$ if $\Delta_{f,e} < u(f)_e$. Therefore, 
we have $s^+_e = 0$ and $(B(f)\Delta_y)_e \le \widetilde{c}(f)_e$. The reduced residual cost on $e$ w.r.t. $y + \Delta_y$ is 

$$\bar{c}(f)_e - (B(f)\Delta_y)_e \ge \bar{c}(f)_e - \widetilde{c}(f)_e \ge -\frac{\varepsilon}{2}.$$

Its reverse, $\mathrm{rev}(e) = (v, u)$, is included in the residual graph $G(f + \Delta_f)$ if $\Delta_{f,e} > 0$. Therefore, we 
have $s^-_e = 0$ and $(B(f)\Delta_y)_{\mathrm{rev}(e)} = -(B(f)\Delta_y)_e \le -\widetilde{c}(f)_e = \widetilde{c}(f)_{\mathrm{rev}(e)}$. The reduced residual cost 
on $\mathrm{rev}(e)$ w.r.t. $y + \Delta_y$ is 

$$\bar{c}(f)_{\mathrm{rev}(e)} - (B(f)\Delta_y)_{\mathrm{rev}(e)} \ge \bar{c}(f)_{\mathrm{rev}(e)} - \widetilde{c}(f)_{\mathrm{rev}(e)} \ge -\frac{\varepsilon}{2}.$$


However, the algorithm implementing Theorem 1.1 only gives a primal optimal solution. We need 
a separate routine for extracting the dual solution from the primal one. 


Lemma C.9. There is an algorithm that given an instance $\mathcal{I} = (G = (V, E), c, u)$ of (58) where 
$G$ has $m$ edges, costs $c \in \{-C,\ldots,C\}^E$, and capacities $u \in \{-U,\ldots,U\}^E$, computes an optimal 
primal and dual solution $f, y, s^-, s^+$ to (58) and (59) in $O(T_{MCC}(m,C,U))$-time. 


Proof. First, we compute $f$, the optimal primal solution to (58), in $T_{MCC}(m, C, U)$-time. Due to 
the optimality of $f$, the residual graph $G(f)$ has no negative cycles. Then, we can compute a 
distance label on $G(f)$ as follows: Add a supervertex $s$ to $G(f)$ with arcs of cost 0 toward every vertex in 
$G(f)$. Then, we can compute a shortest path tree rooted at $s$ by solving an uncapacitated 
min-cost flow with demands $d_s = n$, $d_u = -1$ for $u \in V$ in $O(T_{MCC}(m,C,U))$-time using a standard 
reduction. 

Let $y_u$, $u \in V$, be the distance from $s$ to $u$. Since $y$ is a valid distance label on $G(f)$, we have 
$B(f)y \le c(f)$ where $B(f)$ is the edge-vertex incidence matrix of the residual graph $G(f)$. Then, 
we will construct $s^-, s^+ \ge 0$ such that $(y, s^-, s^+)$ is the optimal dual solution. 

For any arc $e = (u,v) \in G$, if $0 < f_e < u_e$, we set both $s^-_e = s^+_e = 0$. If $f_e = 0$, we set 
$s^-_e = c_e - (y_v - y_u)$ and $s^+_e = 0$. If $f_e = u_e$, we set $s^-_e = 0$ and $s^+_e = y_v - y_u - c_e$. 

Next, we check that $(y, s^-, s^+)$ is a feasible dual solution. For any arc $e = (u,v) \in G$, if 
$0 < f_e < u_e$, both $e$ and $\mathrm{rev}(e)$ appear in $G(f)$. Thus, we have both $y_v - y_u \le c_e$ and $y_u - y_v \le -c_e$ 
and therefore $y_v - y_u = c_e$. Otherwise, $y_v - y_u + s^-_e - s^+_e = c_e$ holds by the definitions of $s^-_e$ and 
$s^+_e$. 

Finally, we check that $c^\top f = -s^{+\top} u$. Complementary slackness and the fact that $f$ is a 
circulation yield 

$$y^\top B^\top f + s^{-\top} f + s^{+\top}(u - f) = 0.$$

Rearrangement yields 

$$-s^{+\top} u = (By + s^- - s^+)^\top f = c^\top f,$$

where the last equality comes from dual feasibility of $(y, s^-, s^+)$. 
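
The dual extraction can be sketched on a toy instance. In the sketch below, the shortest-path labels are computed with plain Bellman-Ford rather than with a min-cost flow call (a substitution made purely for illustration), the optimal circulation is supplied by hand, and the name `extract_dual` is a hypothetical helper.

```python
def extract_dual(n, arcs, flow):
    """Return (y, s_minus, s_plus) for an optimal circulation `flow`.

    arcs: list of (tail, head, cost, capacity). Distances are measured from an
    implicit supervertex with 0-cost arcs to every vertex.
    """
    residual = []
    for (u, v, c, cap), f in zip(arcs, flow):
        if f < cap:
            residual.append((u, v, c))
        if f > 0:
            residual.append((v, u, -c))

    y = [0] * n                           # distance labels from the supervertex
    for _ in range(n):
        changed = False
        for a, b, c in residual:
            if y[a] + c < y[b]:
                y[b] = y[a] + c
                changed = True
        if not changed:
            break
    else:
        raise ValueError("negative residual cycle: flow is not optimal")

    s_minus, s_plus = [], []
    for (u, v, c, cap), f in zip(arcs, flow):
        slack = c - (y[v] - y[u])         # reduced cost of e
        if f == 0:
            s_minus.append(slack); s_plus.append(0)
        elif f == cap:
            s_minus.append(0); s_plus.append(-slack)
        else:
            s_minus.append(0); s_plus.append(0)
    return y, s_minus, s_plus


arcs = [(0, 1, -3, 2), (1, 2, 1, 1), (2, 0, 1, 3)]
flow = [1, 1, 1]                          # optimal: one unit around the triangle
y, s_minus, s_plus = extract_dual(3, arcs, flow)
primal = sum(c * f for (_, _, c, _), f in zip(arcs, flow))
dual = -sum(cap * sp for (_, _, _, cap), sp in zip(arcs, s_plus))
```

On this instance the primal and dual objective values coincide, matching the identity $c^\top f = -s^{+\top} u$.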


Proof of Lemma C.3. Initially, we start with a $C$-optimal flow $f^{(0)} = 0 \in \mathbb{R}^E$ w.r.t. potential 
$0 \in \mathbb{R}^V$. At any iteration $t + 1$, $f^{(t+1)}$ is an $\varepsilon/2$-optimal flow w.r.t. $y^{(t+1)}$ if $f^{(t)}$ is an $\varepsilon$-optimal 
flow w.r.t. $y^{(t)}$ (Lemma C.8). Thus, after $T$ iterations, $f^{(T)}$ is a $C/2^T$-optimal flow w.r.t. $y^{(T)}$. 
Taking $T = O(\log C)$, we have $C/2^T < 1/(n+1)$ and hence $f^{(T)}$ is an optimal solution to (58) due 
to Lemma C.5. 

In every iteration, we solve for an optimal primal and dual solution to (58) on a cost-rounded residual 
graph using Lemma C.9. Such an instance has $m$ edges, polynomially bounded costs (Remark C.7), and $U$-bounded 
capacities, and thus takes $O(T_{\mathcal{A}}(m, U))$-time given an algorithm $\mathcal{A}$ for solving (58). There is 
also an $O(m)$-overhead for constructing the cost-rounded residual graph and updating $f^{(t+1)}$ and $y^{(t+1)}$ 
in each iteration. Overall, the runtime is $O(T_{\mathcal{A}}(m, U) \log C + m \log C)$ since $C = \Omega(\mathrm{poly}(m))$. 


C.2 Reduction to Polynomially Bounded Capacity Instances 


It remains to address the case of min-cost flows with polynomially bounded costs, but possibly 
large capacity. Here we use capacity scaling. 


Lemma C.10. Suppose there is an algorithm A that gives an integral exact minimizer to (58) on 
any m-edge graph and m'°-bounded integral costs, m*°-bounded integral capacities in T4(m) time. 
Algorithm 10 takes as input a graph G = (V, E) and an instance of (58) T = (G,c,u) with costs 
cE {—m!9,...,m!°}" and capacities u € {1,...,U}”, solves T exactly in O(T.4(m) log mlog U + 


mlog mlog U)-time. In other word, Tycc(m,m!, U) = O(Tuca(m, m!, m*°) log m log mU). 


In each iteration, Algorithm 10 augments the current integral circulation $f$ with $\Delta$, a constant-approximate 
integral solution to (58) on the residual graph. After $O(\log(CmU))$ iterations, the 
optimal objective value on the residual graph is at least $-0.1$. This indicates that the value is 0 
and we have reached an optimal solution because the residual graph is always an integral instance 
and has optimal value either 0 or at most $-1$. 

To find a constant-approximate solution, Algorithm 10 first finds a poly($m$)-approximate solution 
of value $-x$, $x > 0$. Then, one can round the residual capacities down to integral multiples 
of $x/\mathrm{poly}(m)$ and show that the optimal solution to the rounded residual instance is a constant 
approximation. Solving (58) on the rounded residual instance is equivalent to solving with polynomially 
bounded capacities, which can be done using Corollary C.2. 

The first component is an algorithm that computes a poly($m$)-approximate solution to (58). 
In particular, we find a $Cm$-approximate solution when the costs $c \in \{-C,\ldots,C\}^E$. Given an 
instance $\mathcal{I} = (G,c,u)$, roughly speaking, the algorithm finds a negative-weight cycle in $G$ with the 
largest bottleneck. This is done by performing binary search over the bottleneck, which has at most $m$ 
different values, and then detecting a negative cycle by solving a unit-capacity version of (58). 


Lemma C.11. Suppose there is an algorithm A that gives an integral exact minimizer to (58) on 
any m-edge graph and m!°-bounded integral costs, m*°-bounded integral capacities in T.4(m) time. 
There is an algorithm that takes as input a graph $G = (V, E)$ and an instance of (58) $\mathcal{I} = (G,c,u)$ 
with costs $c \in \{-m^{10},\ldots,m^{10}\}^E$ and capacities $u \in \{1,\ldots,U\}^E$, and outputs an $m^{12}$-approximate 
solution $f$ such that 

$$c^\top f \le \frac{1}{m^{12}}\, c^\top f^*,$$

where $f^*$ is the optimal solution to (58). The algorithm runs in $O(T_{\mathcal{A}}(m) \log m)$-time. 


Proof. For any directed cycle $C$ in $G$, we define its bottleneck as $u(C) = \min_{e \in C} u_e$. The algorithm 
finds a negative-weight cycle $C^*$ in $G$ with maximum bottleneck. This is done by 
binary search over all possible bottleneck capacities, of which there are at most $m$. Let $u$ be the bottleneck 
capacity we want to check. We construct the graph $G_u$ from $G$ by removing all edges with capacities 
smaller than $u$. Via a standard reduction to unit capacity min-cost circulation on the instance 
$\mathcal{I}' = (G_u, c, \mathbf{1})$, we can either find a negative cost cycle in $G_u$ or determine there is none in $O(T_{\mathcal{A}}(m))$-time. 
There are $O(\log m)$ stages in the binary search and each stage is done in $O(T_{\mathcal{A}}(m))$-time using 
the given min-cost circulation algorithm $\mathcal{A}$. Overall, we can find a negative cost cycle $C^*$ with 
maximum bottleneck in $O(T_{\mathcal{A}}(m) \log m)$-time. 

Let $p(C^*)$ be the flow vector corresponding to $C^*$. We show that $u(C^*)p(C^*)$ is an $m^{12}$-approximate 
solution to (58) on instance $\mathcal{I} = (G,c,u)$. Let $f^*$ be the optimal solution to (58). 
Decompose $f^*$ as a non-negative linear combination of edge-disjoint directed cycles in $G$, i.e. 

$$f^* = \sum_{i=1}^{m} a^*_i\, p(C_i),$$

where $\{C_1,\ldots,C_m\}$ is an edge-disjoint collection of cycles in $G$, $a^*_i$ is a non-negative coefficient, 
and $p(C_i)$ denotes the flow vector corresponding to cycle $C_i$ for $i = 1,\ldots,m$. Due to optimality of 
$f^*$, we can assume that $a^*_i > 0$ only if $C_i$ is a negative cost cycle. 

Observe that $a^*_i$ is at most $u(C_i)$, the bottleneck of $C_i$, and the cost of each cycle is at least 
$-m^{11}$. Thus, we can bound the cost of $f^*$ by 

$$c^\top f^* = \sum_{i:\, c^\top p(C_i) < 0} a^*_i\, c^\top p(C_i) \ge -\sum_{i:\, c^\top p(C_i) < 0} u(C_i)\, m^{11} \ge -m^{12}\, u(C^*),$$

where the last inequality comes from the definition of $C^*$ as the negative cost cycle in $G$ with 
maximum bottleneck. 

On the other hand, the weight of $C^*$ is at most $-1$ because costs are integers. Thus, the cost 
of $u(C^*)p(C^*)$ is at most $-u(C^*)$ and the proof concludes. 
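
The binary search over bottleneck values can be sketched as follows. As a loudly stated assumption, negative-cycle detection is done here with Bellman-Ford instead of the unit-capacity min-cost circulation oracle used in the proof; the function names `has_negative_cycle` and `max_bottleneck` and the example graph are hypothetical.

```python
def has_negative_cycle(n, arcs):
    """Bellman-Ford from an implicit supervertex; arcs are (tail, head, cost)."""
    dist = [0] * n
    for _ in range(n):
        changed = False
        for a, b, c in arcs:
            if dist[a] + c < dist[b]:
                dist[b] = dist[a] + c
                changed = True
        if not changed:
            return False
    return True                      # still relaxing after n passes


def max_bottleneck(n, arcs):
    """Largest u such that arcs with capacity >= u contain a negative cycle.

    arcs: list of (tail, head, cost, capacity). Returns None if there is none.
    """
    caps = sorted({cap for _, _, _, cap in arcs})
    lo, hi, best = 0, len(caps) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        kept = [(a, b, c) for a, b, c, cap in arcs if cap >= caps[mid]]
        if has_negative_cycle(n, kept):
            best = caps[mid]         # feasible: try a larger bottleneck
            lo = mid + 1
        else:
            hi = mid - 1
    return best


arcs = [(0, 1, -2, 5), (1, 0, 1, 3), (1, 2, -5, 10), (2, 1, 1, 1), (2, 0, 1, 10)]
best = max_bottleneck(3, arcs)
```

The search is valid because removing more arcs can only destroy negative cycles, so the predicate is monotone in the threshold.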


Given any B > 0, we can round edge capacities down to integral multiples of B/m?° and 
remove edges of capacity over mB. The optimal circulation for the rounded instance is feasible in 
the original instance. Furthermore, it is a constant approximation. First, let us define the rounded 
instance formally. 


Definition C.12. Given a graph G = (V,E) with costs c € {—m!°,...,m!°}" and capacities 
u € {1,...,U}* and a positive value B > 0. We define its capacity rounded graph G? = (V, EP) 
with costs c and capacities u” as follows: We include every arc e € E to GË and assign its capacity 
uP to be the nearest integral multiple of [B/m?°] below ue or [Bm] if ue > Bm”. 


Remark C.13. By scaling down the rounded capacities by [B/m?°], the capacity of every edge is a 
positive integer at most m*°. Thus, solving (58) on the capacity rounded instance TP = (G?,c, u?) 
is equivalent to solving on an instance with m*°-bounded capacities. In addition, one can recover 
the optimal solution for the capacity rounded instance by scaling up with [B/m?°], which is still 
integral. 


Lemma C.14. Let $G = (V, E)$ be a graph with costs $c \in \{-m^{10},\ldots,m^{10}\}^E$ and capacities 
$u \in \{1,\ldots,U\}^E$, and let $B > 0$. Suppose that $-B$ is an $m^{12}$-approximation to the optimal 
value of (58) on the instance $\mathcal{I} = (G,c,u)$. Let $f^B$ be the optimal solution for (58) on the capacity 
rounded instance $\mathcal{I}^B = (G^B, c, u^B)$. Then $f^B$ is an integral 1.1-approximate solution for the original 
instance $\mathcal{I} = (G, c, u)$. 


Proof. Clearly, $f^B$ is a feasible solution for $\mathcal{I}$ because $f^B \le u^B \le u$ and $G^B$ is a subgraph of $G$. 
Let $f^*$ be the optimal solution for (58) on the instance $\mathcal{I} = (G, c, u)$. 
One can decompose $f^*$ as a non-negative linear combination of edge-disjoint directed cycles in 
$G$, i.e. 

$$f^* = \sum_{i=1}^{m} a^*_i\, p(C_i),$$

where $\{C_1,\ldots,C_m\}$ is an edge-disjoint collection of cycles in $G$, $a^*_i$ is a non-negative coefficient, 
and $p(C_i)$ denotes the flow vector corresponding to cycle $C_i$ for $i = 1,\ldots,m$. Due to optimality of 
$f^*$, we can assume that $a^*_i > 0$ only if $C_i$ is a negative cost cycle. Also, $a^*_i$ is at most the bottleneck 
capacity of the cycle $C_i$, i.e. $a^*_i \le \min_{e \in C_i} u_e$. 

First, we claim that f* < [Bm?°] for any arc e. Otherwise, there is a negative cost cycle C; in 
the decomposition with až > [Bm?°]. The cost of C; is at least —1 due to integral costs. In this 
case, we use the fact that —m!?B < c! f* and deduce 

T pe 
cT f* < —[Bm] < —Bm® < EF m — mic" f* <0, 
which leads to a contradiction. 

If $B < 2m^{20}$, we have $u^B = u$ because $\lceil B/m^{20} \rceil = 1$. In this case, $f^B$ is exactly $f^*$. Otherwise, 
round down the cycle decomposition of $f^*$ to integral multiples of $\lceil B/m^{20} \rceil$. That is, we define 

$$\bar{f} = \sum_{i=1}^{m} \bar{a}_i\, p(C_i),$$

where $\bar{a}_i$ is the nearest integral multiple of $\lceil B/m^{20} \rceil$ at most $a^*_i$. We have that $\bar{a}_i \le \min_{e \in C_i} u^B_e$ 
and hence $\bar{f}$ is a feasible solution for the capacity rounded instance. In addition, we have $\bar{a}_i \ge 
a^*_i - \lceil B/m^{20} \rceil$ for any $i$. Using these facts, we have 


c' f= Se < Sr ate™p(C,) = Sip) 


i=1 i 


where (i) comes from $\bar{a}_i \ge a^*_i - \lceil B/m^{20} \rceil$, (ii) comes from the fact that any simple cycle has cost at least 
$-m^{11}$, (iii) comes from $B \ge 2m^{20}$ and $\lceil B/m^{20} \rceil \le 2B/m^{20}$, and (iv) comes from $c^\top f^* \le -B$, as 
$-B$ is the value of an $m^{12}$-approximate solution. We conclude the proof by observing that 


a ft <e f8 Am ec" ft. 
Algorithm 10: Capacity Scaling Scheme for Solving (58) 

procedure CapacityScaling($G = (V, E)$, $c \in \{-m^{10},\ldots,m^{10}\}^E$, $u \in \{1,\ldots,U\}^E$) 
    $f^{(0)} \leftarrow 0$ 
    $T \leftarrow O(\log U)$ 
    for $t = 0, \ldots, T - 1$ do 
        Compute $-x \le 0$ to be the value of an $m^{12}$-approximate solution to (58) via Lemma C.11. // $x \ge 0$. 
        if $x = 0$ then 
            $f^{(t)}$ is an optimal solution and we end the for loop here. 
        Let $G^x(f^{(t)}), c(f^{(t)}), u^x(f^{(t)})$ be the capacity rounded graph (Definition C.12) of 
        the residual graph $G(f^{(t)})$ with costs $c(f^{(t)})$ and capacities $u(f^{(t)})$. 
        Solve (58) on $G^x(f^{(t)}), c(f^{(t)}), u^x(f^{(t)})$. 
        Let $\Delta_f$ be the primal optimal. 
        $f^{(t+1)} \leftarrow f^{(t)} + \Delta_f$ 
    Output $f^{(T)}$ 


Proof of Lemma C.10. Let $f^*$ be the optimal solution. For any $t$, $f^{(t)}$ is integral since the augmenting 
circulation $\Delta_f$ is always integral (Remark C.13). Therefore, the optimal solution $f^* - f^{(t)}$ 
to the residual instance w.r.t. $f^{(t)}$ is integral and has cost either 0 or at most $-1$. 

Lemma C.14 states that the augmenting circulation $\Delta_f$ is always a 2-approximation to the 
residual instance. Thus, at any iteration $t$, we have 

$$c^\top \Delta_f \le \frac{1}{2}\, c^\top (f^* - f^{(t)}) \le 0.$$

Using the definition of $f^{(t+1)}$ and induction yields 

$$c^\top (f^* - f^{(t+1)}) = c^\top (f^* - f^{(t)}) - c^\top \Delta_f \ge \frac{1}{2}\, c^\top (f^* - f^{(t)}) \ge 2^{-(t+1)}\, c^\top f^*.$$

Since the costs and capacities are bounded by $m^{10}$ and $U$, $c^\top f^*$ is at least $-m^{11}U$. After $T = 
O(\log m + \log U) = O(\log mU)$ iterations, we have 

$$0 \ge c^\top (f^* - f^{(T)}) \ge -\frac{m^{11}U}{2^T} > -1.$$

Combining with the previous observation that $c^\top (f^* - f^{(T)})$ is either 0 or at most $-1$, we have 
that $c^\top (f^* - f^{(T)}) = 0$ and hence $f^{(T)}$ is an optimal solution. 

Each iteration spends $O(T_{\mathcal{A}}(m) \log m)$-time for computing an $m^{12}$-approximate solution, plus 
$O(T_{\mathcal{A}}(m))$-time for computing $\Delta_f$ on a capacity rounded instance, and $O(m)$ for constructing 
instances and computing $f^{(t+1)}$. Overall, the runtime is $O(T_{\mathcal{A}}(m) \log m \log mU + m \log m \log mU)$. 


Now, we can prove Lemma C.1 by combining Lemma C.3 and Lemma C.10. 


Proof of Lemma C.1. As any instance of (8) can be reduced to (58) with linear overhead in the 
number of edges and poly(m) scaling on costs and capacities, the lemma follows directly from 
Lemma C.3 and Lemma C.10. 
D Applications 


Our results directly imply faster running times for algorithms that invoke network flow primitives. 


Extensions of Theorem 1.1. Our main result can be generalized to handle vertex capacities and 
costs by standard transformations (when lower capacities on vertices are zero, one can simply split 
each vertex $v$ into $v_{in}$ and $v_{out}$ such that all in-going edges to $v$ are incident to $v_{in}$ and all out-going 
edges to $v_{out}$ after the split, and then insert an edge $(v_{in}, v_{out})$ with the desired capacity and 
cost). Further, we can generalize our algorithm to handle the flow diffusion problems [WFHMR17; 
FWY20; CPW21] where $d \in \mathbb{R}^V$, $d^\top \mathbf{1} \ge 0$, is considered a vertex capacity vector instead of a 
demand and one wants to find a flow $f$ that satisfies $B^\top f \le d$ while minimizing a cost 
function on $f$. This can be realized by adding special vertices $s, t$ and an edge $(s,v)$ (resp. $(v,t)$) 
for each vertex $v \in V$ where $b_v < 0$ (resp. $b_v > 0$), with lower capacity 0, upper capacity $|b_v|$ and 
cost 0. 
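
The vertex-splitting transformation mentioned above can be sketched as follows. The indexing convention ($v_{in} = v$, $v_{out} = n + v$), the arc representation `(tail, head, cost, capacity)`, and the name `split_vertices` are assumptions made for this illustration.

```python
def split_vertices(n, arcs, vertex_cap, vertex_cost):
    """Reduce vertex capacities/costs to edge capacities/costs.

    Vertex v becomes v_in = v and v_out = n + v, joined by an arc carrying the
    vertex capacity and cost. Every original arc (a, b) becomes (a_out, b_in).
    """
    new_arcs = [(v, n + v, vertex_cost[v], vertex_cap[v]) for v in range(n)]
    new_arcs += [(n + a, b, cost, cap) for a, b, cost, cap in arcs]
    return new_arcs


# Two vertices joined by one arc; vertex 0 has capacity 3, vertex 1 capacity 5.
split = split_vertices(2, [(0, 1, 4, 7)], vertex_cap=[3, 5], vertex_cost=[0, 2])
```

The transformed graph has $2n$ vertices and $m + n$ arcs, so the asymptotic running time is unaffected.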

Previously, considerable effort [CK19; Chu21; BGS21] was directed towards obtaining approximate 
max-flow algorithms that can handle vertex capacities in undirected graphs, where the 
above-mentioned transformations do not translate. Diffusion has been considered for the cost function 
taken to be the $\ell_2$-norm [HRW20; CPW21]. Using simple reductions, we recover a simple 
almost-linear time algorithm that can handle a wide range of cost functions. 

We can also obtain an algorithm that runs in near-linear time to compute $p$-norm flows, i.e. 
flow problems where one is given a weight matrix $W$ and solves the problem $\min_{B^\top f = d} \|Wf\|_p^p$ 
up to a polynomially small error. An even more general problem is considered in Theorem 10.14. 
Previous work either achieved super-linear run-time [AKPS19; ABKS21] or was only able to solve 
the problem when $W$ was taken to be the identity matrix [KPSW19; AS20]. 


Bipartite Matching & Optimal Transport. Many popular variations of matching problems 
are well-known to be reducible to min-cost flow in bipartite graphs, i.e. graphs $G = (V, E)$ where 
there is a partition $V_1, V_2$ of $V$ such that each edge has exactly one endpoint in $V_1$ and one in $V_2$. 

In the standard matching problem, one is given the task of maximizing the number of edges 
without a common vertex in an undirected graph. In the perfect matching problem, the algorithm 
has to output a matching of size $|V|/2$ or conclude that such a matching does not exist. A substantial 
generalization of perfect matching problems is the worker assignment problem: given upper 
capacities $u^+ \in \mathbb{R}^E$ and costs $c \in \mathbb{R}^E$ over the edges and a vector $b \in \mathbb{N}^V$, the goal is to either compute 
a weight $w \in \mathbb{N}^E$ such that each vertex $v \in V$ has incident edges of total weight $b_v$ and where $c^\top w$ 
is minimized over all such choices, or decide that no such weight $w$ exists. Our result implies that 
the worker assignment problem can be solved in time $m^{1+o(1)} \log^2 U$ in bipartite graphs. We 
refer the reader to [GT89a] for an in-depth description of the reduction to min-cost flow. 

Our result can also be used to solve the optimal transportation problem, even with 
entropic regularization (see [DGK18; GHJ20]), which is crucial for applications in machine learning. 
In this problem, one is given a bipartite graph $G = (V_1 \cup V_2, E)$, a demand $d$ that is non-negative 
on $V_1$ and non-positive on $V_2$, and costs $c$, and the goal is to find a flow $f$ that satisfies $B^\top f = d$ and 
minimizes $c^\top f + H(f)$ where $H(f) = \sum_{e \in E} f_e \log(f_e)$. We can use our result in Theorem 10.16 
to obtain the first almost-linear time algorithm that computes an optimal flow $f$ (also called a 
transportation plan) to high accuracy. This improves even over the run-time of $O(n^2)$ taken by current 
state-of-the-art low accuracy solvers [Cut13; BCCNP15; ANR17; DGK18]. Without the entropic 
regularization, the problem is reducible directly to the worker assignment problem. 

The matrix scaling problem [ALOW17; CMTV17] asks: given a matrix $A \in \mathbb{R}^{n \times n}_{\ge 0}$ with 
non-negative polynomially bounded entries, compute positive diagonal matrices $X, Y$ such that all 
row and column sums of $XAY$ are 1. As shown in Section 10.2, the dual of the matrix scaling 
problem is optimal transport with entropic regularization. Hence we achieve an algorithm for 
solving matrix scaling to high accuracy in almost-linear time even when the entries of the matrices 
$X, Y$ may be exponentially large. 


Negative Shortest-Paths and Cycle Detection. We obtain an almost-linear time algorithm to
compute Single-Source Shortest Paths from a dedicated source vertex $s$ in a directed, possibly
negatively weighted graph by invoking Corollary 1.2 with costs set to edge weights, $u^- = 0$,
$u^+ = n-1$, $d_s = n$, and $d_v = -1$ for all $v \in V$. For a graph with weights bounded by $W$ in absolute
value, this gives an algorithm with running time $m^{1+o(1)} \log W$. Further, we can find a negative
directed cycle in a graph by choosing $u^- = d = 0$ and $u^+ = 1$, letting the cost vector equal the weights,
and checking whether the computed flow $f$ is non-zero. If it is, then $f$ is a negative-cost circulation,
and using Link-Cut Trees [ST83] one can recover a negative cycle. For both problems, we give the
first almost-linear time algorithm.
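
The last step, recovering a negative cycle from a 0/1 circulation, can be sketched as follows. This is an illustrative stand-in that uses a naive walk-based cycle decomposition instead of Link-Cut Trees; the function name `negative_cycle_from_circulation` and the edge-list input format are assumptions, and the flow is assumed to be a valid 0/1 circulation produced by some min-cost flow solver.

```python
def negative_cycle_from_circulation(edges, cost, flow):
    """edges: list of (tail, head); cost, flow: lists aligned with edges.
    flow must be a 0/1 circulation. Returns the edge indices of one
    negative-cost cycle, or None if no cycle in the decomposition is negative."""
    # Unused flow-carrying edges leaving each vertex.
    out = {}
    for i, (a, _) in enumerate(edges):
        if flow[i] == 1:
            out.setdefault(a, []).append(i)
    for start in list(out):
        while out[start]:
            path = []          # edge indices on the current walk
            pos = {start: 0}   # vertex -> number of path edges before it
            v = start
            while True:
                e = out[v].pop()   # conservation guarantees an unused edge
                path.append(e)
                v = edges[e][1]
                if v in pos:
                    cycle = path[pos[v]:]
                    # Edges before the cycle stay available for later walks.
                    for pe in path[:pos[v]]:
                        out[edges[pe][0]].append(pe)
                    break
                pos[v] = len(path)
            if sum(cost[e] for e in cycle) < 0:
                return cycle
    return None


edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 2)]
cost = [1, 1, -3, 2, 2]
cycle = negative_cycle_from_circulation(edges, cost, [1, 1, 1, 1, 1])
no_cycle = negative_cycle_from_circulation(edges, cost, [0, 0, 0, 0, 0])
```

Since the circulation decomposes into edge-disjoint cycles, a negative total cost implies that at least one of these cycles is negative.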


Connectivity & Gomory-Hu Trees. Another family of classic combinatorial problems are
connectivity problems, for which many reductions to maximum flow have been found in recent
years. It is well-known that from an $(s,t)$ maximum flow, i.e. the maximum amount of flow that
can be sent in a unit-weighted graph from a vertex $s$ to a vertex $t$, one can find an $(s,t)$ min-cut in
almost-linear time, that is, a bipartition $V_1, V_2$ of the vertex set $V$ of the graph with $s \in V_1, t \in V_2$
such that the number of edges with tail in $V_1$ and head in $V_2$ is minimized.
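
The step from a maximum flow to a minimum cut can be sketched as follows: once no augmenting path remains, the vertices still reachable from $s$ in the residual graph form $V_1$. The sketch below uses a simple BFS augmenting-path routine (Edmonds-Karp style) as a stand-in solver, not the algorithm of this paper; the function name `max_flow_min_cut` is an assumption, and the input is assumed to be a unit-capacity digraph without self-loops.

```python
from collections import deque


def max_flow_min_cut(n, edges, s, t):
    """Unit-capacity digraph on vertices 0..n-1. Returns (flow value, V1),
    where V1 is the source side of a minimum (s, t)-cut."""
    # Residual arcs stored as [head, residual capacity, index of reverse arc].
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append([b, 1, len(adj[b])])
        adj[b].append([a, 0, len(adj[a]) - 1])

    def residual_bfs():
        parent = {s: None}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for i, (w, cap, _) in enumerate(adj[v]):
                if cap > 0 and w not in parent:
                    parent[w] = (v, i)
                    queue.append(w)
        return parent

    value = 0
    while True:
        parent = residual_bfs()
        if t not in parent:
            # No augmenting path: the reachable set is the source side of a min-cut.
            return value, set(parent)
        v = t
        while parent[v] is not None:
            u, i = parent[v]
            arc = adj[u][i]
            arc[1] -= 1                # use one unit on the forward arc
            adj[v][arc[2]][1] += 1     # and free one unit on the reverse arc
            v = u
        value += 1


edges = [(0, 1), (0, 2), (1, 3), (2, 3), (1, 2)]
value, V1 = max_flow_min_cut(4, edges, 0, 3)
cut_size = sum(1 for a, b in edges if a in V1 and b not in V1)
```

The final BFS takes time linear in the graph size, so the cost of finding the cut is dominated by the max-flow computation.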

Our algorithm implies an algorithm that finds the global min-cut, obtained by minimizing over
$(s,t)$ cuts for all pairs $s,t \in V$, in time $mn^{1/2+o(1)}$ in directed graphs [CLNPQS21]. For
undirected graphs, using a reduction from [LP20], we obtain the first almost-linear time algorithm to
compute a Steiner min-cut, which is the minimum $(s,t)$-cut over $s,t \in S$ for a fixed input set $S \subseteq V$.

Our result also implies the first $m^{1+o(1)}$ time algorithm to compute a global vertex min-cut in
undirected graphs via [LNPSY21], i.e. a tripartition $A, B, S$ of $V$ such that there is no edge from $A$
to $B$ where the size of $S$ is minimized. It further gives an $m^{1+o(1)} \mathrm{poly}(k)$ time algorithm to construct
a $k$-vertex connectivity oracle (see [PSY22]).

Finally, we consider algorithms to compute a Gomory-Hu tree, that is, a weighted tree $T$ over the
vertex set of $G$ such that for any two vertices $s,t \in V$, the $(s,t)$ min-cut in $G$ has the same value as
in $T$. Our result gives the first $m^{1+o(1)}$ time algorithm to compute Gomory-Hu trees in unweighted
graphs (via [AKLPST21; Zha21]), or to a $(1 + \epsilon)$-approximation in weighted graphs (via [LP21])
for an arbitrarily small constant $\epsilon$.
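
To illustrate how a Gomory-Hu tree is used once it has been built, the sketch below answers a min-cut value query: the $(s,t)$ min-cut value in a tree is the smallest edge weight on the unique $s$-$t$ path. The function name `tree_min_cut_value` and the dictionary representation of the tree are assumptions made for this example; constructing the tree itself is not shown.

```python
def tree_min_cut_value(tree_edges, s, t):
    """tree_edges: dict mapping (u, v) to the weight of tree edge {u, v}.
    Returns the minimum edge weight on the unique s-t path in the tree."""
    adj = {}
    for (u, v), wt in tree_edges.items():
        adj.setdefault(u, []).append((v, wt))
        adj.setdefault(v, []).append((u, wt))
    # Depth-first search from s, carrying the smallest weight seen so far.
    stack = [(s, None, float("inf"))]
    while stack:
        v, parent, best = stack.pop()
        if v == t:
            return best
        for w, wt in adj.get(v, []):
            if w != parent:
                stack.append((w, v, min(best, wt)))
    raise ValueError("s and t are not connected in the tree")


# Hypothetical Gomory-Hu tree on four vertices.
tree = {(0, 1): 3, (1, 2): 1, (2, 3): 4}
q03 = tree_min_cut_value(tree, 0, 3)
q23 = tree_min_cut_value(tree, 2, 3)
q01 = tree_min_cut_value(tree, 0, 1)
```

A single tree thus encodes the min-cut values of all vertex pairs using only $n - 1$ weighted edges.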

We point out that, for all cited problems, we improve the run-time by polynomial factors (in $m$).


Directed Expanders. We say a cut $(S, V \setminus S)$ in a digraph $G$ is $\phi$-out-sparse if
$\frac{|E(S, V \setminus S)|}{\min\{\mathrm{vol}(S), \mathrm{vol}(V \setminus S)\}} < \phi$,
where $E(S, V \setminus S)$ is the set of edges with tail in $S$ and head in $V \setminus S$ and $\mathrm{vol}(X)$ is the sum of
degrees of vertices in $X$ in $G$. A graph $G$ is called a $\phi$-expander if $G$ allows no $\phi$-out-sparse cut.
Applying our max-flow algorithm to a straightforward extension of the cut-matching game
[KRV09; Lou10] gives an $m^{1+o(1)}$ time algorithm that, given any graph $G$ and parameter
$\phi \in (0, 1/O(\log^2 m)]$, either outputs an $O(\phi \log^2 m)$-out-sparse cut or certifies that $G$ is a $\phi$-expander. The
algorithm also works when a $\phi$-out-sparse cut is redefined to be a cut $(S, V \setminus S)$ with
$\frac{|E(S, V \setminus S)|}{\min\{|S|, |V \setminus S|\}} < \phi$.
This improves over the previously best run-time of $O(m/\phi)$ for sparse graphs for a wide range
of values for $\phi$.
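
As a small illustration of the volume-based definition, the sketch below evaluates the out-sparsity ratio of a given cut. The function name `out_sparsity` is an assumption, and so is the reading of "degree" as in-degree plus out-degree; the sketch also assumes both sides of the cut have non-zero volume.

```python
def out_sparsity(n, edges, S):
    """Ratio |E(S, V \\ S)| / min(vol(S), vol(V \\ S)) for a digraph on
    vertices 0..n-1, with degree taken as in-degree plus out-degree
    (an assumption of this sketch)."""
    S = set(S)
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    # Only edges leaving S count; edges entering S are ignored.
    crossing = sum(1 for a, b in edges if a in S and b not in S)
    vol_S = sum(deg[v] for v in S)
    vol_rest = sum(deg) - vol_S
    return crossing / min(vol_S, vol_rest)


edges = [(0, 1), (1, 0), (1, 2), (2, 3), (3, 2)]
forward = out_sparsity(4, edges, {0, 1})   # one edge leaves {0, 1}
backward = out_sparsity(4, edges, {2, 3})  # no edge leaves {2, 3}
```

The two calls show that the definition is direction-sensitive: the same bipartition is $\phi$-out-sparse from one side for every $\phi > 0$, while from the other side it is only $\phi$-out-sparse for $\phi > 1/5$.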

As a concrete application, we obtain an $mn^{0.5+o(1)}$ total time algorithm for the problem of
maintaining strongly connected components in a graph undergoing edge deletions that works against an
adaptive adversary (via [BGS20]), improving on the previously best time of $mn^{2/3+o(1)}$.


Isotonic Regression. Isotonic regression is a classic shape-constrained nonparametric regression
method. The problem is formulated as follows: we are given a DAG (Directed Acyclic Graph)
$G = (V, E)$ and a vector $y \in \mathbb{R}^V$. The goal is to project $y$ onto the space of vectors that are
isotonic with respect to $G$. A vector $x \in \mathbb{R}^V$ is said to be isotonic with respect to $G$ if the embedding
of $V$ into $\mathbb{R}$ given by $x$ is weakly order-preserving with respect to the partial order described by
$G$. The projection is usually computed using a weighted $\ell_p$ norm. This can be captured as the
following convex program: $\min_x \|W(x - y)\|_p$, subject to the constraints $x_i \leq x_j$ for all $(i, j) \in E$.
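
For intuition, the special case where $G$ is a directed path and $p = 2$ can be solved by the classical pool-adjacent-violators procedure, sketched below. This is a textbook method for that special case only, not the algorithm of this paper; the function name `isotonic_chain_l2` is an assumption, and the weights `w` correspond to the squared diagonal entries of $W$.

```python
def isotonic_chain_l2(y, w=None):
    """Minimize sum_i w[i] * (x[i] - y[i])**2 subject to
    x[0] <= x[1] <= ... (a path DAG), via pool-adjacent-violators."""
    if w is None:
        w = [1.0] * len(y)
    blocks = []  # each block is [weighted mean, total weight, length]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    x = []
    for mean, _, count in blocks:
        x.extend([mean] * count)
    return x


fit_a = isotonic_chain_l2([1, 3, 2, 4])
fit_b = isotonic_chain_l2([3, 2, 1])
```

In the first call the violating pair 3, 2 is pooled to its mean 2.5; in the second call all three values are pooled to 2.0. On a general DAG the constraint structure is no longer a chain, which is where the flow-based formulation becomes relevant.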

We give an almost-linear time algorithm for computing a $1/\mathrm{poly}(n)$ additive approximate
solution to isotonic regression for all $p \in [1, \infty)$. The previous best time bounds were $O(m^{1.5})$ for
$p \in [1, \infty)$ [KRS15], $O(nm \log \frac{n^2}{m})$ for $p \in (1, \infty)$ [HQ03], and $O(nm + n^2 \log n)$ for $p = 1$ [Sto13]. We
stress that this running time is almost-linear in the number of edges in the underlying DAG, which
could be significantly smaller than the number of edges in the transitive closure, which determines
the running time of some algorithms [Sto21].